Web3


Enhancing RAG Context Recall with a Custom Embedding Model: Guide



Retrieval-augmented generation (RAG) has become a go-to approach for integrating large language models (LLMs) into specialized business applications, allowing proprietary data to be directly infused into the model’s responses. However, as powerful as RAG is during the proof of concept (POC) phase, developers frequently encounter significant accuracy drops when deploying it into production. This issue is especially noticeable during the retrieval phase, where the goal is to accurately retrieve the most relevant context for a given query—a metric often referred to as context recall.

This guide focuses on how to improve context recall by customizing and fine-tuning an embedding model. We’ll explore embedding models, how to prepare a dataset tailored to your needs, and specific steps for training and evaluating your model, all of which can significantly enhance RAG’s performance in production. Here’s how to refine your embedding model and boost your RAG context recall by over 95%.

What is RAG and Why Does it Struggle in Production?

RAG consists of two primary steps: retrieval and generation. During retrieval, the model fetches the most relevant context by converting the text into vectors, indexing, retrieving, and re-ranking these vectors to select the top matches. In the generation stage, this retrieved context is combined with prompts, which are then sent to the LLM to generate responses. Unfortunately, the retrieval phase often fails to retrieve all relevant contexts, causing drops in context recall and leading to less accurate generation outputs.

One solution is adapting the embedding model—a neural network designed to understand the relationships between text data—so it produces embeddings that are highly specific to your dataset. This fine-tuning enables the model to create similar vectors for similar sentences, allowing it to retrieve contexts that are more relevant to the query.

Understanding Embedding Models

Embedding models extend beyond simple word vectors, offering sentence-level semantic understanding. For instance, embedding models trained with techniques such as masked language modeling learn to predict masked words within a sentence, giving them a deep understanding of language structure and context. These embeddings are often optimized using distance metrics like cosine similarity to prioritize and rank the most relevant contexts during retrieval.
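Concretely, cosine similarity is the dot product of two vectors divided by the product of their lengths; a minimal pure-Python sketch over toy 4-dimensional vectors (real embedding models emit hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: the first two point in similar directions (related sentences),
# the third does not (unrelated sentence).
query_vec = [0.9, 0.1, 0.3, 0.4]
related_context = [0.8, 0.2, 0.35, 0.5]
unrelated_context = [-0.1, 0.9, -0.4, 0.1]

print(cosine_similarity(query_vec, related_context))    # close to 1.0
print(cosine_similarity(query_vec, unrelated_context))  # much lower
```

During retrieval, the contexts are simply ranked by this score against the query vector.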

For example, an embedding model might generate similar vectors for sentences such as “The sky is a brilliant blue today” and “The meadow glows green under the sun.”

Even though they describe different things, they both relate to the theme of color and nature, so they are likely to have a high similarity score.

For RAG, high similarity between a query and relevant context ensures accurate retrieval. Let’s examine a practical case where we aim to improve this similarity for better results.

Customizing the Embedding Model for Enhanced Context Recall

To significantly improve context recall, we adapt the embedding model to our specific dataset, making it better suited to retrieve relevant contexts for any given query. Rather than training a new model from scratch, which is resource-intensive, we fine-tune an existing model on our proprietary data.

Why Not Train from Scratch?

Starting from scratch isn’t necessary because most embedding models are pre-trained on billions of tokens and have already learned a substantial amount about language structures. Fine-tuning such a model to make it domain-specific is far more efficient and ensures quicker, more accurate results.

Step 1: Preparing the Dataset

A customized embedding model requires a dataset that closely mirrors the kind of queries it will encounter in real use. Here’s a step-by-step breakdown:

Training Set Preparation

Mine Questions: Extract a wide range of questions related to your knowledge base using the LLM. If your knowledge base is extensive, consider chunking it and generating questions for each chunk.

Paraphrase for Variability: Paraphrase each question to expand your training dataset, helping the model generalize better across similar queries.

Organize by Relevance: Assign each question a corresponding context that directly addresses it. The aim is to ensure that during training, the model learns to associate specific queries with the most relevant information.
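The three steps above can be sketched as a small pipeline. Here the chunking is concrete, while `generate_questions` and `paraphrase` are hypothetical stand-ins for the LLM prompts you would implement yourself, not a library API:

```python
def chunk_text(text, max_words=100):
    """Split a knowledge base into roughly equal word-count chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def build_training_pairs(chunks, generate_questions, paraphrase, n_paraphrases=2):
    """Mine questions per chunk, add paraphrased variants, and pair every
    variant with the chunk it was mined from (the relevant context)."""
    pairs = []
    for chunk in chunks:
        for question in generate_questions(chunk):
            variants = [question] + [paraphrase(question) for _ in range(n_paraphrases)]
            pairs.extend({"question": q, "context": chunk} for q in variants)
    return pairs
```

Plugging in LLM-backed implementations of the two callbacks yields the (question, context) pairs the trainer consumes.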

Testing Set Preparation

Sample and Refine: Create a smaller test set by sampling real user queries or questions that may come up in practice. This testing set helps ensure that your model performs well on unseen data.

Include Paraphrased Variations: Add slight paraphrases of the test questions to help the model handle different phrasings of similar queries.

For this example, we’ll use the “PubMedQA” dataset from Hugging Face, which contains unique publication IDs (pubid), questions, and contexts. Here’s a sample code snippet for loading and structuring this dataset:

import pandas as pd
from datasets import Dataset, load_dataset

med_data = load_dataset("qiaojin/PubMedQA", "pqa_artificial", split="train")

med_data = med_data.remove_columns(["long_answer", "final_decision"])
df = pd.DataFrame(med_data)
df["contexts"] = df["context"].apply(lambda x: x["contexts"])
expanded_df = df.explode("contexts")
expanded_df.reset_index(drop=True, inplace=True)

# Split so that training later has distinct train/test subsets
splitted_dataset = Dataset.from_pandas(expanded_df[["question", "contexts"]]).train_test_split(test_size=0.1)

Step 2: Constructing the Evaluation Dataset

To assess the model’s performance during fine-tuning, we prepare an evaluation dataset. This dataset is derived from the training set but serves as a realistic representation of how well the model might perform in a live setting.

Generating Evaluation Data

From the PubMedQA dataset, select a sample of contexts, then use the LLM to generate realistic questions based on each one. For example, given a context on immune cell response in breast cancer, the LLM might generate a question like “How does the immune cell profile affect breast cancer treatment outcomes?”

Each row of your evaluation dataset will thus include several context-question pairs that the model can use to assess its retrieval accuracy.

from openai import OpenAI

client = OpenAI(api_key="")  # supply your API key

prompt = """Your task is to mine questions from the given context.
{context} {example_question} """

questions = []
for row in eval_med_data_seed:
    context = "\n\n".join(row["context"]["contexts"])
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt.format(context=context, example_question=row["question"])},
        ],
    )
    questions.append(completion.choices[0].message.content.split("|"))

Step 3: Setting Up the Information Retrieval Evaluator

To gauge model accuracy in the retrieval phase, use an Information Retrieval Evaluator. The evaluator retrieves and ranks contexts based on similarity scores and assesses them using metrics like Recall@k, Precision@k, Mean Reciprocal Rank (MRR), and Accuracy@k.
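For intuition, Recall@k and MRR can be computed by hand from ranked results; a minimal sketch of both metrics (the evaluator computes these for you in practice):

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def mean_reciprocal_rank(all_ranked, all_relevant):
    """Average of 1/rank of the first relevant document per query (0 if none found)."""
    total = 0.0
    for ranked_ids, relevant_ids in zip(all_ranked, all_relevant):
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break
    return total / len(all_ranked)
```

A Recall@k of 1.0 means every relevant context for the query landed in the top k retrieved results.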

Define Corpus and Queries: Organize the corpus (context information) and queries (questions from your evaluation set) into dictionaries.

Set Relevance: Establish relevance by linking each query ID with a set of relevant context IDs, which represents the contexts that ideally should be retrieved.

Evaluate: The evaluator calculates metrics by comparing retrieved contexts against relevant ones. Recall@k is a critical metric here, as it indicates how well the retriever pulls relevant contexts from the database.
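Assuming each evaluation row carries a `pubid`, a `question`, and its list of `contexts` (the PubMedQA fields used above), the three mappings the evaluator expects might be assembled like this sketch:

```python
def build_ir_eval_inputs(rows):
    """Build the queries / corpus / relevant_docs dictionaries.

    Each row: {"pubid": ..., "question": ..., "contexts": [str, ...]}
    """
    eval_queries, eval_corpus, eval_relevant_docs = {}, {}, {}
    for row in rows:
        qid = f"q_{row['pubid']}"
        eval_queries[qid] = row["question"]
        eval_relevant_docs[qid] = set()
        for i, context in enumerate(row["contexts"]):
            cid = f"c_{row['pubid']}_{i}"
            eval_corpus[cid] = context
            eval_relevant_docs[qid].add(cid)  # every context of the row is relevant
    return eval_queries, eval_corpus, eval_relevant_docs
```

The resulting dictionaries plug directly into the evaluator's `queries`, `corpus`, and `relevant_docs` arguments.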

from sentence_transformers.evaluation import InformationRetrievalEvaluator

ir_evaluator = InformationRetrievalEvaluator(
    queries=eval_queries,
    corpus=eval_corpus,
    relevant_docs=eval_relevant_docs,
    name="med-eval-test",
)

Step 4: Training the Model

Now we’re ready to train our customized embedding model. Using the sentence-transformers library, we’ll configure the training parameters and use the MultipleNegativesRankingLoss function to optimize similarity scores between queries and positive contexts.

Training Configuration

Set the following training configurations:

Training Epochs: Number of training cycles.

Batch Size: Number of samples per training batch.

Evaluation Steps: Frequency of evaluation checkpoints.

Save Steps and Limits: Frequency and total limit for saving the model.
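The four knobs above might look like this in practice (hypothetical starting values; tune them for your dataset size and hardware):

```python
# Hypothetical starting values; tune per dataset size and GPU memory.
training_config = {
    "num_train_epochs": 4,              # training cycles over the full dataset
    "per_device_train_batch_size": 32,  # samples per training batch
    "eval_steps": 500,                  # run the IR evaluator every 500 steps
    "save_steps": 500,                  # checkpoint frequency
    "save_total_limit": 2,              # keep only the two most recent checkpoints
}

# With sentence-transformers v3+ these map onto the trainer's argument object:
# from sentence_transformers import SentenceTransformerTrainingArguments
# args = SentenceTransformerTrainingArguments(output_dir="models/med-embed", **training_config)
```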

from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("stsb-distilbert-base")
train_loss = losses.MultipleNegativesRankingLoss(model=model)
args = SentenceTransformerTrainingArguments(output_dir="models/med-embed")  # apply the settings listed above

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=splitted_dataset["train"],
    eval_dataset=splitted_dataset["test"],
    loss=train_loss,
    evaluator=ir_evaluator,
)

trainer.train()

Results and Improvements

After training, the fine-tuned model should display significant improvements, particularly in context recall. In testing, fine-tuning produced relative improvements of:

Recall@1: 78.8%

Recall@3: 137.9%

Recall@5: 116.4%

Recall@10: 95.1%

Such improvements mean that the retriever can pull more relevant contexts, leading to a substantial boost in RAG accuracy overall.
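The figures above are relative gains; given a metric before and after fine-tuning, the gain is computed as follows (illustrative numbers, not the raw scores behind the table):

```python
def relative_gain_pct(before, after):
    """Percentage improvement of `after` over `before`."""
    return (after - before) / before * 100

# Illustrative: a Recall@10 rising from 0.41 to 0.80 is a ~95% relative gain.
print(round(relative_gain_pct(0.41, 0.80), 1))  # → 95.1
```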

Final Notes: Monitoring and Retraining

Once deployed, monitor the model for data drift and periodically retrain as new data is added to the knowledge base. Regularly assessing context recall ensures that your embedding model continues to retrieve the most relevant information, maintaining RAG’s accuracy and reliability in real-world applications. By following these steps, you can achieve high RAG accuracy, making your model robust and production-ready.

FAQs

What is RAG in machine learning? RAG, or retrieval-augmented generation, is a method that retrieves specific information to answer queries, improving the accuracy of LLM outputs.

Why does RAG fail in production? RAG often struggles in production because the retrieval step may miss critical context, resulting in poor generation accuracy.

How can embedding models improve RAG performance? Fine-tuning embedding models to a specific dataset enhances retrieval accuracy, improving the relevance of retrieved contexts.

What dataset structure is ideal for training embedding models? A dataset with varied queries and relevant contexts that resemble real queries enhances model performance.

How frequently should embedding models be retrained? Embedding models should be retrained as new data becomes available or when significant accuracy dips are observed.




Apple Admits to Security Vulnerability That Leaves Crypto Users Exposed—Here’s What You Should Do – Decrypt




Apple confirmed Monday its devices were left vulnerable to an exploit that allowed for remote malicious code execution through web-based JavaScript, opening up an attack vector that could have parted unsuspecting victims from their crypto.

According to a recent Apple security disclosure, users must use the latest versions of its JavaScriptCore and WebKit software to patch the vulnerability. 

The bug, discovered by researchers at Google’s threat analysis group, allows for “processing maliciously crafted web content,” which could lead to a “cross-site scripting attack.”

More alarmingly, Apple also admitted it “is aware of a report that this issue may have been actively exploited on Intel-based Mac systems.”

Apple also issued a similar security disclosure for iPhone and iPad users. There, it warns that the JavaScriptCore vulnerability means “processing maliciously crafted web content may lead to arbitrary code execution.”

In other words, Apple became aware of a security flaw that could let hackers take control of a user’s iPhone or iPad if they visit a harmful website. An update should solve the issue, Apple said.

Jeremiah O’Connor, CTO and co-founder of crypto cybersecurity firm Trugard, told Decrypt that “attackers could access sensitive data like private keys or passwords” stored in their browser, enabling crypto theft if the user’s device remained unpatched.

Revelations of the vulnerability within the crypto community began circulating on social media on Wednesday, with former Binance CEO Changpeng Zhao raising the alarm in a tweet advising that users of MacBooks with Intel CPUs should update as soon as possible.

The development follows March reports that security researchers had discovered a vulnerability in Apple’s previous-generation chips (its M1, M2, and M3 series) that could let hackers steal cryptographic keys.

The exploit, which isn’t new, leverages “prefetching,” a process used by Apple’s own M-series chips to speed up interactions with the company’s devices. Prefetching can be exploited to store sensitive data in the processor’s cache and then access it to reconstruct a cryptographic key that is supposed to be inaccessible.

Unfortunately, Ars Technica reports that this is a significant issue for Apple users, since a chip-level vulnerability cannot be solved through a software update.

Potential workarounds can alleviate the problem, but they trade performance for security.

Edited by Stacy Elliott and Sebastian Sinclair





Crypto in the Metaverse: Forging a Revolutionary Digital Economy – Web3oclock



What are Metaverse Cryptocurrencies?

The Use of Cryptocurrencies in Virtual Worlds:

1. Facilitating Virtual Commerce:

2. Buying and Selling Virtual Real Estate:

3. Governance and Voting:

4. Revolutionizing Gaming:

5. Reward Mechanisms and Microtransactions:

6. Tokenizing Digital Art and Collectibles:

7. Royalties and Fair Compensation:

8. Reducing Volatility with Stablecoins:

9. Cross-Platform Interoperability:




Nextrope Partners with Hacken to Enhance Blockchain Security – Nextrope – Your Trusted Partner for Blockchain Development and Advisory Services



The 33rd Economic Forum 2024, held in Karpacz, Poland, gathered leaders from across the globe to discuss the most pressing economic and technological challenges. This year, the forum had a special focus on Artificial Intelligence (AI) and Cybersecurity, bringing together leading experts and policymakers.

Nextrope was proud to participate in the Forum where we showcased our expertise and networked with leading minds in the AI and blockchain fields.

Economic Forum 2024: A Hub for Innovation and Collaboration

The Economic Forum in Karpacz is an annual event often referred to as the “Polish Davos,” attracting over 6,000 participants, including heads of state, business leaders, academics, and experts. This year’s edition was held from September 3rd to 5th, 2024.

Key Highlights of the AI Forum and Cybersecurity Forum

The AI Forum and the VI Cybersecurity Forum were integral parts of the event, organized in collaboration with the Ministry of Digital Affairs and leading Polish universities, including:

Cracow University of Technology

University of Warsaw

Wrocław University of Technology

AGH University of Science and Technology

Poznań University of Technology

Objectives of the AI Forum

Promoting Education and Innovation: The forum aimed to foster education and spread knowledge about AI and solutions to enhance digital transformation in Poland and CEE.

Strengthening Digital Administration: The event supported the Ministry of Digital Affairs’ mission to build and strengthen the digital administration of the Polish State, encouraging interdisciplinary dialogue on decentralized architecture.

High-Level Meetings: The forum featured closed meetings of digital ministers from across Europe, including a confirmed appearance by Volker Wissing, the German Minister for Digital Affairs.

Nextrope’s Active Participation in the AI Forum

Nextrope’s presence at the AI Forum was marked by our active engagement in various activities in the Cracow University of Technology and University of Warsaw zone. One of the discussion panels we enjoyed the most was “AI in education – threats and opportunities”.

Our Key Activities

Networking with Leading AI and Cryptography Researchers.

Nextrope presented its contributions in the field of behavioral profiling in DeFi and established relationships with cryptography researchers from Cracow University of Technology and some of the brightest minds on the Polish AI scene, coming from institutions such as Wrocław University of Technology as well as from startups.

Panel Discussions and Workshops

Our team participated in several panel discussions covering a variety of topics, including:

Polish Startup Scene.

State in the Blockchain Network

Artificial Intelligence – Threat or Opportunity for Healthcare?

Silicon Valley in Poland – Is it Possible?

Quantum Computing – How Is It Changing Our Lives?

Broadening Horizons

Besides tuning in to topics that strictly overlap with our professional expertise, we decided to broaden our horizons and participated in panels about national security and cross-border cooperation.

Meeting with clients:

We had the pleasure of deepening relationships with our institutional clients and discussing plans for the future.

Networking with Experts in AI and Blockchain

A major highlight of the Economic Forum in Karpacz was the opportunity to network with experts from academia, industry, and government.

Collaborations with Academia:

We engaged with scholars from leading universities such as the Cracow University of Technology and the University of Warsaw. These interactions laid the groundwork for potential research collaborations and joint projects.

Building Strategic Partnerships:

Our team connected with industry leaders, exploring partnership opportunities around building the future of education. We met many extremely smart yet humble people interested in joining the advisory board of one of our projects, HackZ.

Exchanging Knowledge with VCs and Policymakers:

We had fruitful discussions with policymakers and highly knowledgeable representatives of venture capital. The discussions revolved around blockchain and AI regulation, futuristic education methods, and dilemmas regarding digital transformation in companies. These exchanges provided us with very interesting insights as well as new friendships.

Looking Ahead: Nextrope’s Future in AI and Blockchain

Nextrope’s participation in the Economic Forum Karpacz 2024 has solidified our position as one of the leading, deep-tech software houses in CEE. By fostering connections with academia, industry experts, and policymakers, we are well-positioned to consult our clients on trends and regulatory needs as well as implementing cutting edge DeFi software.

What’s Next for Nextrope?

Continuing Innovation:

We remain committed to developing cutting-edge software solutions and designing token economies that leverage the power of incentives and advanced cryptography.

Deepening Academic Collaborations:

The partnerships formed at the forum will help us stay at the forefront of technological advancements, particularly in AI and blockchain.

Expanding Our Global Reach:

The international connections made at the forum enable us to expand our influence both in CEE and outside of Europe. This reinforces Nextrope’s status as a global leader in technology innovation.

If you’re looking to create a robust blockchain system and go through institutional-grade testing please reach out to contact@nextrope.com. Our team is ready to help you with the token engineering process and ensure your project’s resilience in the long term.




Insider information: Positive profit warning – Trainers’ House updates its guidance for 2024 | Web3Wire



TRAINERS’ HOUSE GROUP, INSIDER INFORMATION 21 NOVEMBER 2024 at 8:15

Trainers’ House’s year-end sales, order backlog and encounter marketing business have developed better than expected despite the continued difficult market environment.

Trainers’ House is raising its full-year profit guidance.

According to the updated guidance, the company estimates that the operating profit for 2024 will be between a loss of EUR 50 thousand and a profit of EUR 150 thousand.

In its financial statement release published earlier, on February 22, 2024, the company estimated the operating profit for 2024 to be negative.

TRAINERS’ HOUSE PLC
BOARD OF DIRECTORS

Information:
Arto Heimonen, CEO, +358 404 123 456
Saku Keskitalo, CFO, +358 404 111 111






The New Search Engine Battle: Perplexity AI vs. Google



The search engine landscape is on the brink of another revolution. We’ve come a long way since the early days of Yahoo and MSN. Google redefined search by focusing on simplicity, precision, and user experience, making it the dominant search engine for nearly three decades. With an unprecedented market share—over 90% globally—and its own browser, Chrome, used by about 65% of internet users, Google seems almost irreplaceable. But even giants can face disruption.

Despite Google’s widespread use, the user’s fundamental goal has remained constant: finding the most accurate, efficient, and easy-to-access answers online. As Google ages, it increasingly relies on ad-centric strategies and SEO-dominated content to drive its business model. But now, advanced AI search engines like Perplexity are challenging Google’s methods, offering answers that align more closely with user intent.

Will Google adapt and thrive, or is it on the path to obsolescence? Let’s dig deeper.

Google’s Original Innovation and Why It Dominated the Market

When Google launched in 1998, it transformed search by focusing on simplicity, speed, and accuracy. Yahoo’s search results, for example, were embedded within a cluttered portal full of advertisements and other links, but Google’s clean, minimalist design prioritized the user’s need to find information quickly. The revolutionary PageRank algorithm introduced a way to rank pages by relevancy, vastly improving search quality.

Another key factor behind Google’s success was its innovative revenue model, AdWords, which leveraged targeted advertising. This model didn’t just generate profits; it gave Google the resources to maintain and expand its market presence. Google became synonymous with online search, leading to the phrase, “Just Google it,” cementing its place in digital culture.

However, over time, Google’s search structure has become more complex and, at times, cluttered. The foundational user intents—efficiency, accuracy, and simplicity—seem to have taken a back seat to revenue generation and SEO dominance.

How Google Has Fallen Short of User Expectations

While Google’s growth and success are undeniable, its evolution has left some user needs under-addressed. In a world where people search for everything from relationship advice to the best deal on flights, Google’s strategy often requires users to sift through multiple links and ads to find accurate answers. For instance, if you search for something specific, like the best hotel deals in New York, you might need to click on multiple ads, scroll through SEO-optimized content, and browse several pages to locate the right answer.

What users truly seek are three main things:

Efficiency – Quick and straightforward answers with minimal clicks.

Accuracy – Precise, relevant, and correct information.

Simplicity – A straightforward and easy-to-navigate interface.

While these expectations haven’t changed, Google’s response hasn’t kept pace. As a result, new technologies and platforms—like Perplexity AI—are stepping in to meet this demand.

Why Google’s Model Is Showing Its Age

The current design of Google’s search engine feels crowded and ad-heavy. A significant reason for this is Google’s dependence on SEO-driven content and an ad-centric revenue model. This has led to a compromised user experience, which raises friction for users and often results in a trust deficit. Google’s challenges stem from:

Ad-Centric Results: Prioritizing paid content can detract from the quality and relevance of search results.

SEO-Driven Influence: Results are often manipulated through SEO, which doesn’t always equate to the best answers.

User Journey Complexity: Users frequently need to explore multiple links and navigate through numerous pages to find direct answers.

As more users grow tired of these issues, AI-driven search engines are offering a new way to address user intent directly, without layers of ad-based distractions.

AI’s Impact: Solving Problems Google Couldn’t in 1998

AI advancements are making it possible to fulfill user intent more efficiently than ever. Thanks to Natural Language Processing (NLP), neural networks, and large language models, new platforms are bridging gaps Google struggles to address. This shift has been fueled by:

Natural Language Processing (NLP): Enabling AI to interpret human language accurately and contextually.

Neural Networks and Large Language Models (LLMs): Equipping systems to process extensive data and offer conversational responses.

Conversational Interfaces: Allowing users to receive answers in dialogue format, reducing the need for multiple searches and clicks.

AI-driven platforms like Perplexity and ChatGPT utilize these innovations to deliver concise, contextually relevant answers without the distractions of ads or SEO-optimized—but sometimes irrelevant—content.

Perplexity AI vs. Google Search: What Are Users Actually Looking For?

Unlike Google, which provides a list of links based on relevance and popularity, Perplexity AI offers answers in a conversational format. If you ask, “How many years since Google was founded?” Perplexity delivers the answer directly rather than leading you to a page full of links. Users like this straightforward approach, which saves time and reduces frustration.

Here’s a comparative chart that captures key differences between Perplexity AI and Google Search:

| Feature | Perplexity AI | Google Search |
| --- | --- | --- |
| Primary Function | AI-driven conversational search engine | Traditional search engine with ranked results |
| Search Mechanism | Provides conversational responses and summarized answers | Shows a list of webpages ordered by relevance |
| Response Type | Generates direct answers with supporting sources | Relies on snippets with links to relevant websites |
| Contextual Follow-up | Allows for context-based follow-up questions | Generally requires rephrasing or new searches for follow-ups |
| Sources Displayed | Cites sources explicitly in responses | Lists sources as separate results, typically no in-text citations |
| User Experience | More interactive and conversational | Linear list-style results |
| Strengths | Quick, detailed answers; good for deep dives and direct knowledge | Wide-ranging information; established, comprehensive index |
| Weaknesses | May miss specialized niche information | Can overwhelm with irrelevant or overly general results |
| Ideal Use Cases | Quick research, summarization, specific queries | Broad information gathering, varied resources |
| Response Customization | Can adapt answers based on prior queries | Lacks adaptive, context-driven responses |

Comparing the usage stats (as of mid-2024):

Google holds a commanding market share of around 90%, with over 4.3 billion users.

ChatGPT claims roughly 9.81% of users, while Perplexity has grown to capture 0.22% of the market within just two years.

This growth, despite Perplexity’s new entry into the market, demonstrates that some users are already seeking alternative solutions to Google.

A Historical Analogy: The Shift from Canals to Railroads

Google’s challenge is reminiscent of the early 19th-century shift from canals to railroads. Canals once dominated transportation, revolutionizing how goods were moved. They were initially efficient, reliable, and widely used. But when railroads emerged, they proved faster, more versatile, and usable year-round.

Railroads drastically reduced travel times, could be constructed over more diverse terrains, and operated even in winter, while canals froze over. Despite these advantages, many canal companies failed to anticipate the disruption caused by railroads, clinging to their established ways and ultimately facing obsolescence.

Similarly, Google, with its longstanding search model, may need to adapt quickly to remain competitive in the evolving digital world.

Google’s AI Overview: A Response to Competitors

To counter the competition, Google has started incorporating AI features, like the AI Overview section, which answers “who,” “what,” “when,” “why,” and “how” questions directly within search results. However, this feature has its limitations:

Limited Activation: The AI Overview is only available for certain question formats, meaning it doesn’t appear for all search types.

Inconsistent Accuracy: At times, the answers provided can be off-target or incomplete, leading users back to browsing multiple links for clarity.

In comparison, Perplexity consistently provides accurate answers without constraints, making it more user-friendly for those seeking precise, immediate information. Google’s current approach may not be enough to address the real demands of modern users.

Can Google Overcome Its Business Model’s Conflict with User Intent?

Google’s primary revenue comes from advertising, a model that incentivizes ad exposure over user satisfaction. This conflict has created an opening for platforms like Perplexity AI, which originally offered an ad-free, user-centered experience. However, as Perplexity grows, it too faces sustainability challenges. While an ad-free model is attractive, it’s also costly to maintain.

To address this, Perplexity’s CEO has announced plans to introduce ads, though in a user-centric way, such as placing native ads within related question prompts or offering brand-sponsored questions. This strategy aims to provide value without compromising the user experience, potentially making it a sustainable revenue stream for the platform.

Revenue-Sharing and Content Partnerships: Perplexity’s Innovative Approach

Perplexity is also introducing a revenue-sharing model with content publishers. When the platform uses and cites articles from sources like Time, Fortune, or Der Spiegel, it shares a portion of the ad revenue with these publishers. This collaborative approach benefits content creators, boosts Perplexity’s credibility, and ensures the sustainability of quality information.

What Lies Ahead: Questions for Google and Perplexity’s Leaders

Both Google and Perplexity face critical decisions. Google must weigh the value of its ad-based model against user satisfaction, while Perplexity must sustain its growth without losing its user-first philosophy. Key questions include:

Would users pay for a premium, ad-free experience? Google could offer a paid, ad-free search tier if demand warrants it.

Can Perplexity and other AI platforms capture more market share before Google fully adapts to meet modern needs?

How long can Google rely on its traditional methods before seeing a noticeable user shift toward AI-driven solutions like Perplexity?

Conclusion: Will Google Evolve or Risk Becoming Obsolete?

The comparison between Google and Perplexity is a stark reminder of how technology can disrupt even the most established companies. Like canals that were eventually overshadowed by railroads, Google risks being outpaced by faster, more adaptable technologies that better address user needs.

Google has the tools to adapt by integrating advanced AI more thoroughly into its platform, prioritizing user intent, and exploring alternative revenue streams. But if it remains anchored to an ad-heavy, SEO-influenced model, it may find itself in a battle with platforms like Perplexity that continue to evolve and grow.

The future of search is uncertain, but one thing is clear: the platforms that prioritize user needs while finding sustainable monetization strategies are the ones that will define the next era of search technology.

FAQs

1. Can Perplexity AI fully replace Google?

Not entirely. Google has a vast user base and extensive services beyond search. However, Perplexity could become a preferred choice for users seeking quick, accurate answers without the ad-heavy experience.

2. Will Google offer an ad-free search experience?

It’s possible. Google may consider offering a paid, premium ad-free version if demand for a simpler, ad-free search grows.

3. How does Perplexity AI provide direct answers?

Using advanced AI, Perplexity interprets natural language queries and delivers conversational answers, minimizing the need for users to sift through multiple links.

4. Are ads necessary for AI platforms to be sustainable?

For many AI platforms, ads are a practical revenue source. When implemented thoughtfully, ads can support the platform without detracting from user experience.

5. What other AI platforms are challenging Google?

ChatGPT, Claude, and other AI-driven platforms are emerging as alternatives, offering conversational, direct answers that appeal to users looking for efficient search solutions.



Source link

Facebook’s Vision of the Metaverse: A New Digital Frontier – Web3oclock



What is the Metaverse According to Meta?

Key Pillars of Meta’s Metaverse Initiatives:

1. Horizon Worlds

2. Horizon Workrooms

3. Horizon Venues

4. Oculus and Next-Gen VR/AR Hardware

5. Meta Avatars and AI-Powered Personalization

6. Spark AR and Augmented Reality Initiatives

7. Virtual Economy and Digital Commerce

8. Project Cambria (High-End VR Headset)

The Technologies Behind Meta’s Metaverse:

1. Virtual Events

Potential Challenges and Criticisms of Meta




Grayscale to Launch Bitcoin ETF Options Following BlackRock’s Record Debut – Decrypt




Crypto asset manager Grayscale Investments plans to roll out options trading on its spot Bitcoin ETFs on Wednesday amid the first glimpses of solid investor appetite for such products.

The announcement comes a day after BlackRock’s iShares Bitcoin Trust (IBIT) achieved record-breaking activity on its first day of options trading, pushing Bitcoin to a new all-time high.

Grayscale will launch options trading on GBTC (Grayscale Bitcoin Trust) and BTC (Bitcoin Mini Trust) to “further [develop] the ecosystem around our US-listed Bitcoin ETPs,” it said.

Following the Options Clearing Corporation’s (OCC) approval of Bitcoin ETF options, Grayscale quickly filed an updated prospectus for its Bitcoin Covered Call ETF on January 11.

The tooling aims to generate income by employing a covered call strategy—writing and buying options contracts on Bitcoin exchange-traded products (ETPs) while holding Bitcoin or GBTC as collateral.
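To make the payoff mechanics concrete, here is a minimal sketch of a covered call per unit of the underlying. The entry price, strike, and premium below are hypothetical illustration numbers, not figures from Grayscale's fund.

```python
def covered_call_payoff(spot_at_expiry, entry_price, strike, premium):
    """Payoff per unit of a covered call: long the asset, short one call on it."""
    long_asset = spot_at_expiry - entry_price               # P&L on the held asset
    short_call = premium - max(spot_at_expiry - strike, 0)  # premium kept, upside capped
    return long_asset + short_call

# Hypothetical numbers: buy at 90k, write a 100k-strike call for a 3k premium.
print(covered_call_payoff(95_000, 90_000, 100_000, 3_000))   # 8000: asset gain plus full premium
print(covered_call_payoff(120_000, 90_000, 100_000, 3_000))  # 13000: gain capped at strike, plus premium
```

The second case shows the trade-off the strategy accepts: income in flat markets, but gains above the strike are forfeited to the call buyer.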

Bloomberg ETF analyst James Seyffart called attention to the speed of Grayscale’s response following the OCC’s clearance, tweeting Tuesday that the asset manager was “wasting no time.”

“They’ve filed an updated prospectus for their Bitcoin Covered Call ETF,” Seyffart tweeted. “The fund will offer exposure to $GBTC & $BTC while writing &/or buying options contracts on Bitcoin ETPs for income.”

Grayscale follows the unprecedented debut of BlackRock’s IBIT options, which recorded nearly $1.9 billion in notional exposure traded on its first day.

Seyffart shared details on X, observing that 354,000 contracts were exchanged, including 289,000 calls and 65,000 puts, representing a 4.4:1 call-to-put ratio.

The ratio indicates that a significantly larger number of investors placed bets on Bitcoin’s price rise (calls) compared to those hedging against a potential price drop (puts). 
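The reported ratio is straightforward to verify from the contract counts:

```python
calls, puts = 289_000, 65_000   # first-day IBIT option volumes reported by Seyffart
print(calls + puts)             # 354000 contracts in total
print(f"{calls / puts:.1f}:1")  # 4.4:1 call-to-put ratio
```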

“These options were almost certainly part of the move to the new Bitcoin all-time highs today,” Seyffart wrote, referring to Bitcoin’s surge to $94,041 on Tuesday.

Bloomberg’s senior ETF analyst Eric Balchunas characterized the $1.9 billion trading volume as “unheard of” for any given options trading within an ETF during its first day.

“For context, BITO did $363 million, and that’s been around for four years,” Balchunas wrote on X, referring to ProShares’ Bitcoin futures ETF.

Roughly 73,000 options contracts were traded in the first 60 minutes, placing IBIT among the top 20 most active non-index options on its opening day.

Grayscale’s launch comes a year after its major legal victory against the SEC. Last August, the U.S. Court of Appeals ordered the SEC to revisit its denial of Grayscale’s application to convert its Bitcoin Trust into a spot ETF.

This ruling was a turning point for crypto ETFs, challenging regulatory resistance that had stalled their approvals for nearly a decade.

Edited by Sebastian Sinclair





Step-by-Step Guide to Creating an LLM-Based App for Chat with Papers



Staying updated with the latest in machine learning (ML) research can feel overwhelming. With the steady stream of papers on large language models (LLMs), vector databases, and retrieval-augmented generation (RAG) systems, it’s easy to fall behind. But what if you could access and query this vast research library using natural language? In this guide, we’ll create an AI-powered assistant that mines and retrieves information from Papers With Code (PWC), providing answers based on the latest ML papers.

Our app will use a RAG framework for backend processing, incorporating a vector database, VertexAI’s embedding model, and an OpenAI LLM. The frontend will be built on Streamlit, making it simple to deploy and interact with.

Step 1: Data Collection from Papers With Code

Papers With Code is a valuable resource that aggregates the latest ML papers, source code, and datasets. To automate data retrieval from this site, we’ll use the PWC API. This allows us to collect papers related to specific keywords or topics.

Retrieving Papers Using the API

To search for papers programmatically:

Access the PWC API Swagger UI and locate the papers/ endpoint.

Use the q parameter to enter keywords for the topic of interest.

Execute the query to retrieve data.

Each response includes the first set of results, with additional pages accessible via the next key. To retrieve multiple pages, you can set up a function that loops through all pages based on the initial result count. Here’s a Python script to automate this:

import requests
import urllib.parse
from tqdm import tqdm

def extract_papers(query: str):
    query = urllib.parse.quote(query)
    url = f"https://paperswithcode.com/api/v1/papers/?q={query}"
    response = requests.get(url).json()
    count = response["count"]
    results = response["results"]

    # The API returns 50 results per page; fetch the remaining pages.
    num_pages = -(-count // 50)  # ceiling division
    for page in tqdm(range(2, num_pages + 1)):
        url = f"https://paperswithcode.com/api/v1/papers/?page={page}&q={query}"
        response = requests.get(url).json()
        results.extend(response["results"])
    return results

query = "Large Language Models"
results = extract_papers(query)
print(len(results))

Formatting Results for LangChain Compatibility

Once extracted, convert the data to LangChain-compatible Document objects. Each document will contain:

page_content: stores the paper’s abstract.

metadata: includes attributes like id, arxiv_id, url_pdf, title, authors, and published.

from langchain.docstore.document import Document

documents = [
    Document(
        page_content=result["abstract"],
        metadata={
            "id": result.get("id", ""),
            "arxiv_id": result.get("arxiv_id", ""),
            "url_pdf": result.get("url_pdf", ""),
            "title": result.get("title", ""),
            "authors": result.get("authors", ""),
            "published": result.get("published", ""),
        },
    )
    for result in results
    if result.get("abstract")  # skip entries without an abstract
]

Chunking for Efficient Retrieval

Since LLMs have token limitations, breaking down each document into chunks can improve retrieval and precision. Using LangChain’s RecursiveCharacterTextSplitter, set chunk_size to 1200 characters and chunk_overlap to 200. This will generate manageable text chunks for optimal LLM input.

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1200,
    chunk_overlap=200,
    separators=["."]
)
splits = text_splitter.split_documents(documents)
print(len(splits))

Step 2: Creating an Index with Upstash

To store embeddings and document metadata, set up an index in Upstash, a serverless database ideal for our project. After logging into Upstash, set your index parameters:

Region: closest to your location.

Dimensions: 768, matching VertexAI’s embedding dimension.

Distance Metric: cosine similarity.
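Cosine similarity, the metric chosen above, scores two embeddings by the angle between them rather than their magnitudes. Upstash computes it server-side, but a minimal sketch shows what the score means:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0: identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0: unrelated
```

Queries and abstracts that the embedding model maps to nearby directions will therefore score close to 1.0 and be retrieved first.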

Then, install the upstash-vector package:

pip install upstash-vector

Use the credentials generated by Upstash (URL and token) to connect to the index in your app.

from upstash_vector import Index

index = Index(
    url="<UPSTASH_URL>",
    token="<UPSTASH_TOKEN>"
)

Step 3: Embedding and Indexing Documents

To add documents to Upstash, we’ll create a class UpstashVectorStore that embeds document chunks and indexes them. The class includes a method to upsert documents in batches and a method to run similarity search against the index.

from typing import List, Tuple
from uuid import uuid4

from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
from tqdm import tqdm
from upstash_vector import Index

class UpstashVectorStore:
    def __init__(self, index: Index, embeddings: Embeddings):
        self.index = index
        self.embeddings = embeddings

    def add_documents(
        self,
        documents: List[Document],
        batch_size: int = 32,
    ) -> List[str]:
        texts, metadatas, all_ids = [], [], []

        for document in tqdm(documents):
            texts.append(document.page_content)
            # Store the chunk text in the metadata so it can be recovered at query time.
            metadatas.append({"context": document.page_content, **document.metadata})

            if len(texts) >= batch_size:
                ids = [str(uuid4()) for _ in texts]
                all_ids += ids
                embeddings = self.embeddings.embed_documents(texts)
                self.index.upsert(vectors=zip(ids, embeddings, metadatas))
                texts, metadatas = [], []

        # Index any remaining chunks that did not fill a full batch.
        if texts:
            ids = [str(uuid4()) for _ in texts]
            all_ids += ids
            embeddings = self.embeddings.embed_documents(texts)
            self.index.upsert(vectors=zip(ids, embeddings, metadatas))

        print(f"Indexed {len(all_ids)} vectors.")
        return all_ids

    def similarity_search_with_score(
        self, query: str, k: int = 4
    ) -> List[Tuple[Document, float]]:
        query_embedding = self.embeddings.embed_query(query)
        results = self.index.query(query_embedding, top_k=k, include_metadata=True)
        # Each result carries the metadata stored at indexing time plus a similarity score.
        return [
            (
                Document(
                    page_content=result.metadata.pop("context"),
                    metadata=result.metadata,
                ),
                result.score,
            )
            for result in results
        ]

To execute this indexing:

from langchain.embeddings import VertexAIEmbeddings

embeddings = VertexAIEmbeddings(model_name="textembedding-gecko@003")
upstash_vector_store = UpstashVectorStore(index, embeddings)
ids = upstash_vector_store.add_documents(splits, batch_size=25)

Step 4: Querying Indexed Papers

With the abstracts indexed in Upstash, querying becomes straightforward. We’ll define functions to:

Retrieve relevant documents.

Build a prompt using these documents for LLM responses.

def get_context(query, vector_store):
    results = vector_store.similarity_search_with_score(query)
    return "\n===\n".join([doc.page_content for doc, _ in results])

def get_prompt(question, context):
    template = """
Use the provided context to answer the question accurately.

%CONTEXT%
{context}

%Question%
{question}

Answer:
"""
    return template.format(question=question, context=context)

For example, if you ask about the limitations of RAG frameworks:

query = “What are the limitations of the Retrieval Augmented Generation framework?”
context = get_context(query, upstash_vector_store)
prompt = get_prompt(query, context)

Step 5: Building the Application with Streamlit

To make our app user-friendly, we’ll use Streamlit for a simple, interactive UI. Streamlit makes it easy to deploy ML-powered web apps with minimal code.

import streamlit as st
from langchain.chat_models import AzureChatOpenAI

st.title("Chat with ML Research Papers")
query = st.text_input("Ask a question about ML research:")

if st.button("Submit"):
    if query:
        context = get_context(query, upstash_vector_store)
        prompt = get_prompt(query, context)
        llm = AzureChatOpenAI(model_name="<MODEL_NAME>")
        answer = llm.predict(prompt)
        st.write(answer)

Benefits and Limitations of Retrieval-Augmented Generation (RAG)

RAG systems offer unique advantages, especially for ML researchers:

Access to Up-to-Date Information: RAG lets you pull information from the latest sources.

Enhanced Trust: Answers grounded in source documents make results more reliable.

Easy Setup: RAGs are relatively straightforward to implement without needing extensive computing resources.

However, RAG isn’t perfect:

Data Dependence: RAG accuracy hinges on the data fed into it.

Not Always Optimal for Complex Queries: While fine for demos, real-world applications may need extensive tuning.

Limited Context: RAG systems are still limited by the LLM’s context size.

Conclusion

Building a conversational assistant for machine learning research using LLMs and RAG frameworks is achievable with the right tools. By using Papers With Code data, Upstash for vector storage, and Streamlit for a user interface, you can create a robust application for querying recent research.

Further Exploration Ideas:

Use the full paper text rather than just abstracts.

Experiment with metadata filtering to improve precision.

Explore hybrid retrieval techniques and re-ranking for more relevant results.
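To illustrate the last idea, one simple hybrid scheme blends the vector score with a crude lexical-overlap score and re-ranks on the combination. The weighting `alpha` and the sample documents below are arbitrary illustrations, not tuned settings.

```python
def lexical_overlap(query: str, text: str) -> float:
    """Fraction of query terms that also appear in the text (crude keyword score)."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0

def hybrid_rerank(query, scored_docs, alpha=0.7):
    """scored_docs: list of (text, vector_score); blend both scores and sort descending."""
    blended = [
        (text, alpha * vec_score + (1 - alpha) * lexical_overlap(query, text))
        for text, vec_score in scored_docs
    ]
    return sorted(blended, key=lambda pair: pair[1], reverse=True)

docs = [("RAG combines retrieval with generation", 0.82),
        ("Vector databases store embeddings", 0.79)]
ranked = hybrid_rerank("retrieval augmented generation", docs)
print(ranked[0][0])  # the document sharing query terms wins the tie-break
```

In production you would replace the keyword score with a proper sparse retriever such as BM25, but the blending-and-sorting shape stays the same.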

Whether you’re an ML enthusiast or a researcher, this approach to interacting with research papers can save time and streamline the learning process.




Top 2024 Crypto Presales with 10x Growth Potential | Web3Wire




As the cryptocurrency landscape continues to evolve, investors are perpetually on the lookout for the next big opportunity. One promising avenue in 2024 is crypto presales, where projects are available before official launch. These presales can be a strategic entry point, offering tokens at a discount, and potentially reaping significant returns if the project succeeds. In this post, we will explore some of the top crypto presales in 2024 that have the potential to grow tenfold or more.

Understanding Crypto Presales

Before diving into specific presales, it’s crucial to understand what crypto presales are. Essentially, a presale is a fundraising event in which blockchain startups sell their tokens to early investors before they become available to the general public. This early-phase fundraising provides startups with the much-needed capital to develop their projects.

Benefits of Investing in Crypto Presales:

Access to tokens at a discounted rate compared to post-launch prices

Opportunity to back projects with innovative solutions and advanced technology

Potential for significant returns on investment if the project succeeds

Top 2024 Crypto Presales to Watch

Let’s explore some promising projects with the potential for substantial growth:

1. Project X: The Future of Decentralized Finance

Project X aims to transform the decentralized finance (DeFi) sector with its innovative platform, which offers new features and improved security. Its focus on creating a more user-friendly DeFi experience has garnered considerable attention from early investors. The team behind Project X comprises seasoned blockchain developers and financial experts.

Objective: Simplify and secure DeFi transactions

Unique Selling Point: User-friendly interface with enhanced security features

Growth Potential: High, given the expanding DeFi market

2. GreenToken: Revolutionizing Green Energy Investment

With an increasing emphasis on sustainable solutions, GreenToken aims to revolutionize the way individuals can invest in green energy projects. This blockchain-based platform offers transparency and easy access to a variety of green projects worldwide.

Objective: Enhance investment in green energy projects

Unique Selling Point: Democratize investment opportunities in sustainable energy

Growth Potential: Significant, as sustainability becomes more important globally

3. MetaVerseSpace: Pioneering Virtual Real Estate

The metaverse continues to capture interest, and MetaVerseSpace is at the forefront of virtual real estate. By offering a platform for users to buy, sell, and develop virtual land, this project has already attracted a strong community of early adopters excited about the possibilities within digital landscapes.

Objective: Facilitate virtual real estate transactions in the metaverse

Unique Selling Point: Comprehensive platform for virtual land and real estate

Growth Potential: Immense, with increasing interest and investment in metaverse spaces

How to Assess a Crypto Presale

When considering which crypto presales to invest in, it’s important to conduct thorough research. Here are some factors to consider:

Project Fundamentals: Analyze the project’s whitepaper, roadmap, and team credentials.

Technology and Innovation: Assess what makes the project stand out in the marketplace.

Community Engagement: A strong, active community can be an indicator of future success.

Market Opportunity: Understand the industry and scope where the project operates.

Conclusion: The Lucrative Potential of Crypto Presales

Crypto presales in 2024 present exciting opportunities for savvy investors ready to explore innovative projects with substantial growth potential. While investing in presales carries inherent risks, thorough research and strategic selection can help mitigate these risks, paving the way for potentially rewarding experiences.

The projects outlined above are just a glimpse of the emerging technologies and solutions coming our way. Each of these presales embodies significant potential, aligning with evolving market trends and technological advancements. By staying informed and strategic, investors can position themselves advantageously in the ever-evolving world of cryptocurrency.


About Web3Wire

Web3Wire – Information, news, press releases, events and research articles about Web3, Metaverse, Blockchain, Artificial Intelligence, Cryptocurrencies, Decentralized Finance, NFTs and Gaming. Visit Web3Wire for Web3 News and Events, Block3Wire for the latest Blockchain news and Meta3Wire to stay updated with Metaverse News.



