Web3

Post Pectra, Ethereum now targets efficiency with 60 million gas limit expansion


Ethereum is eyeing a roughly 67% increase in its gas limit, from 36 million to 60 million units, to improve transaction capacity and network efficiency.

On May 7, Ethereum core developer Parithosh Jayanthi confirmed that the 60 million gas limit would be rolled out on Ethereum’s mainnet once it has been tested successfully on the Sepolia, Hoodi, and Holesky testnets.

He stated:

“We start shipping 60M on sepolia tomorrow, hoodi/holesky shortly after. If its deemed safe and we patch all found bugs, we ship on mainnet.”

The gas limit determines the maximum computational effort a block can carry, which includes basic transactions, smart contract executions, and interactions with decentralized applications.

By increasing this limit, Ethereum can process more activity per block, potentially reducing congestion and enabling faster execution.
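To make the capacity change concrete, here is a minimal sketch of the arithmetic. It assumes every transaction is a plain ETH transfer costing 21,000 gas, which gives only an upper bound, since real blocks mix transfers with far more expensive contract calls. The function names are illustrative and do not come from any Ethereum client.

```python
# Gas cost of a plain ETH transfer (a fixed protocol constant).
TRANSFER_GAS = 21_000


def max_transfers(gas_limit: int) -> int:
    """Upper bound on plain ETH transfers that fit under a block gas limit."""
    return gas_limit // TRANSFER_GAS


def percent_increase(old: int, new: int) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


current_limit = 36_000_000   # limit after the February increase
proposed_limit = 60_000_000  # limit under discussion

print(max_transfers(current_limit))   # 1714
print(max_transfers(proposed_limit))  # 2857
print(round(percent_increase(current_limit, proposed_limit), 1))  # 66.7
```

Under this simplification, the proposed limit leaves room for about 1,100 more simple transfers per block.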

According to community resource PumpTheGas, the upgrade could lower the network’s Layer 1 transaction fees by 10% to 30%, depending on network activity.

If completed, this would mark the second gas limit increase this year.

In February, a similar move saw the limit rise from 30 million to 36 million. Notably, this was the first gas limit adjustment since 2021.

Ethereum validators rally behind upgrade

The proposed increase has received significant support from Ethereum validators and leading voices in the ecosystem.

Available data shows that nearly 80% of Ethereum validators support increasing the current gas limit from 36 million. Around 10,000 have signaled readiness to implement the higher limit of 60 million.

Ethereum Validators Support Gas Limit Increase (Source: Gaslimit.pics)

Ethereum Foundation researcher Justin Drake confirmed his validator is already configured for the change. He described the move as safe, especially after recent improvements introduced by the Pectra update.

Ethereum core developer Eric Conner also urged others to follow suit, emphasizing the long-term benefits of a higher gas ceiling.

Meanwhile, this widespread support is consistent with previous calls from Ethereum co-founder Vitalik Buterin and researcher Dankrad Feist, who have advocated for expanding the base-layer capacity.

Buterin has repeatedly emphasized the need to scale Ethereum’s base layer, suggesting a tenfold increase in gas limits to support growing demand despite the growth of Layer 2 solutions.


G7 to Discuss Crypto Hacks, Tackle North Korean Issue: Report – Decrypt


In brief

G7 leaders are expected to discuss North Korea’s use of crypto hacks to fund weapons programs next month, amid heightened geopolitical tensions.
The DPRK has stolen billions in crypto through advanced cyberattacks, including high-profile breaches like Axie Infinity and Bybit.
Officials are increasingly concerned about Pyongyang’s deepening ties with Russia and the growing sophistication of its cyber units.

Leaders from the Group of Seven countries are reportedly looking to discuss how the numerous crypto hacks and malicious cyber activities that North Korea has engaged in for years could be addressed and mitigated.

Discussions will likely center around mounting geopolitical concerns and trade tensions between the U.S. and the G7’s other member countries, according to an initial report from Bloomberg citing sources with knowledge of the summit’s plans.

North Korea’s actions have been characterized as alarming, with investigations revealing how stolen funds, most of which stem from the crypto industry, have been used to support its development of weapons of mass destruction.

The summit, hosted by Canadian Prime Minister Mark Carney at Kananaskis, Alberta, will run from June 15 to 17, bringing together leaders from seven of the world’s most advanced economies.

G7 member countries include France, Germany, Italy, Japan, the U.K., the U.S., and this year’s host, Canada.

Crypto hacking playbook

The Democratic People’s Republic of Korea (DPRK) has established a rigorous cyber operation in recent years, often possessing sophisticated means to target and seize digital assets from vulnerable protocols.

Its offensives have gained notoriety for employing increasingly complex tactics, resulting in major losses, including Axie Infinity’s $622 million loss and WazirX’s $230 million theft.

The discussions come as North Korea has shown deepening ties with Russia in recent years, which have included sending military aid to Russia amid its ongoing invasion of Ukraine.

Those ties heighten threats “by sharing tools and expertise, complicating attribution and response efforts,” Luis Lubeck, project manager at crypto cybersecurity firm Hacken, told Decrypt in December last year.

Beyond the more commonly known Lazarus Group, other threat actors are working behind the scenes, according to an analysis of the regime’s operational structure by blockchain security expert Samczsun of research-driven crypto investment firm Paradigm.

In February, crypto exchange Bybit suffered the industry’s largest hack at $1.4 billion, with North Korea’s Lazarus Group pegged as the culprit by cybersecurity firms and later confirmed by federal authorities.

Edited by Sebastian Sinclair


Data Integration Software Market Is Booming So Rapidly | Big Giants Microsoft, IBM, Informatica | Web3Wire


Data Integration Software Market

HTF MI just released the Global Data Loss Prevention (DLP) Solutions Market Study, a comprehensive analysis of the market that spans more than 143 pages and describes the product and industry scope as well as the market prognosis and status for 2025-2032. The study segments the market by key regions, and the market is currently expanding its reach.

Major companies profiled in Data Loss Prevention (DLP) Solutions Market are:

Microsoft Corporation, Broadcom Inc., Cisco Systems, Inc., Fortra, LLC, Cloudflare, Inc., IBM Corporation, McAfee, Inc., Endpoint Protector by CoSoSys Ltd., Imperva, Ekran Systems, Symantec, Digital Guardian, Trend Micro, Trustwave, Code Green Network, Zecurion.

Request PDF Sample Copy of Report: (Including Full TOC, List of Tables & Figures, Chart) @👉 https://www.htfmarketreport.com/sample-report/3307571-data-loss-prevention-2?utm_source=Saroj_openpr&utm_id=Saroj

HTF Market Intelligence projects that the global Data Loss Prevention (DLP) Solutions market will expand at a compound annual growth rate (CAGR) of 24.1% from 2025 to 2032, from 2.21 in 2025 to 10.05 by 2032.

The following Key Segments Are Covered in Our Report

By Type: Network DLP, Endpoint DLP, Cloud DLP, Email DLP, Storage DLP

By Application: Healthcare, BFSI, IT and Telecom, Government, Retail

Definition: DLP solutions are security measures that detect and prevent potential data breaches by monitoring, detecting, and blocking sensitive data while in use, in motion, and at rest.

Market Trends: AI and Machine Learning Integration, Behavioral Analytics, Cloud-native DLP Solutions, Zero Trust Security Models, Managed Security Services

Market Drivers: Increasing Data Breaches, Regulatory Compliance Requirements, Cloud Adoption, Remote Work Trends, Insider Threats

Market Challenges: Complex Implementation, High Costs, False Positives, User Resistance, Evolving Threat Landscape

Dominating Region: North America

Fastest-Growing Region: Asia Pacific

Have different Market Scope & Business Objectives; Enquire for customized study 👉https://www.htfmarketreport.com/enquiry-before-buy/3307571-data-loss-prevention-2?utm_source=Saroj_openpr&utm_id=Saroj

The titled segments and sub-sections of the market are illuminated below:

In-depth analysis of Data Loss Prevention (DLP) Solutions Market segments by Types: Network DLP, Endpoint DLP, Cloud DLP, Email DLP, Storage DLP

Detailed analysis of Data Loss Prevention (DLP) Solutions Market segments by Applications: Healthcare, BFSI, IT and Telecom, Government, Retail

Geographically, the detailed analysis of consumption, revenue, market share, and growth rate of the following regions:

• North America: United States of America (US), Canada, and Mexico.
• South & Central America: Argentina, Chile, Colombia, and Brazil.
• Middle East & Africa: Kingdom of Saudi Arabia, United Arab Emirates, Turkey, Israel, Egypt, and South Africa.
• Europe: the UK, France, Italy, Germany, Spain, Nordics, Baltic countries, Russia, Austria, and the Rest of Europe.
• Asia: India, China, Japan, South Korea, Taiwan, Southeast Asia (Singapore, Thailand, Malaysia, Indonesia, Philippines, Vietnam, etc.) and the Rest of Asia.
• Oceania: Australia and New Zealand.

Buy Now Latest Edition Data Loss Prevention (DLP) Solutions Market Report 👉https://www.htfmarketreport.com/reports/3307571-data-loss-prevention-2

Data Loss Prevention (DLP) Solutions Market Research Objectives:

– To focus on the key manufacturers and to define, describe and examine their value, sales volume, market share, competitive landscape, SWOT analysis, and development plans for the next few years.
– To share comprehensive information about the key factors influencing the growth of the market (opportunities, drivers, growth potential, industry-specific challenges and risks).
– To analyze the market with respect to individual future prospects, growth trends and their contribution to the total market.
– To analyze developments such as agreements, expansions, new product launches, and acquisitions in the market.
– To profile the key players and systematically examine their growth strategies.

FIVE FORCES & PESTLE ANALYSIS:

Five forces analysis (the threat of new entrants, the threat of substitutes, the threat of competition, and the bargaining power of suppliers and buyers) is carried out to better understand market circumstances.

• Political (Political policy and stability as well as trade, fiscal, and taxation policies)
• Economical (Interest rates, employment or unemployment rates, raw material costs, and foreign exchange rates)
• Social (Changing family demographics, education levels, cultural trends, attitude changes, and changes in lifestyles)
• Technological (Changes in digital or mobile technology, automation, research, and development)
• Legal (Employment legislation, consumer law, health and safety, international as well as trade regulation and restrictions)
• Environmental (Climate, recycling procedures, carbon footprint, waste disposal, and sustainability)

Points Covered in Table of Content of Global Data Loss Prevention (DLP) Solutions Market:

Chapter 01 – Data Loss Prevention (DLP) Solutions Market Executive Summary
Chapter 02 – Market Overview
Chapter 03 – Key Success Factors
Chapter 04 – Global Data Loss Prevention (DLP) Solutions Market – Pricing Analysis
Chapter 05 – Global Data Loss Prevention (DLP) Solutions Market Background or History
Chapter 06 – Global Data Loss Prevention (DLP) Solutions Market Segmentation (e.g. Type, Application)
Chapter 07 – Key and Emerging Countries Analysis
Chapter 08 – Global Data Loss Prevention (DLP) Solutions Market Structure & Worth Analysis
Chapter 09 – Global Data Loss Prevention (DLP) Solutions Market Competitive Analysis & Challenges
Chapter 10 – Assumptions and Acronyms
Chapter 11 – Data Loss Prevention (DLP) Solutions Market Research Method

Thank you for reading this post. You may also obtain report versions by area, such as North America, LATAM, Europe, Japan, Australia, or Southeast Asia, or by chapter.

Nidhi Bhawsar (PR & Marketing Manager)
HTF Market Intelligence Consulting Private Limited
Phone: +15075562445
sales@htfmarketintelligence.com

About Author: HTF Market Intelligence Consulting is uniquely positioned to empower and inspire businesses with growth strategies through research and consulting services, offering extraordinary depth and breadth of thought leadership, research, tools, events, and experience that assist in decision-making.

This release was published on openPR.

About Web3Wire Web3Wire – Information, news, press releases, events and research articles about Web3, Metaverse, Blockchain, Artificial Intelligence, Cryptocurrencies, Decentralized Finance, NFTs and Gaming. Visit Web3Wire for Web3 News and Events, Block3Wire for the latest Blockchain news and Meta3Wire to stay updated with Metaverse News.




WisdomAI Snags $23M to Transform Enterprise Decision-Making with Next-Gen, Reliable AI Analytics – Web3oclock


A Smarter, Faster, Cleaner Way to Tap Enterprise Data:

Why It Stands Out?

Early Adoption and Enterprise Momentum:

Future Outlook:




Sett Secures $27M to Transform the Future of AI-Driven Game Creation and Marketing – Web3oclock


Metaplanet Buys More Bitcoin Amid Easing US-China Trade Tensions – Decrypt


In brief

Metaplanet bought 555 more BTC, bringing its total holdings to 5,555 Bitcoin, worth roughly $370 million.
China confirmed trade talks with the U.S., but warned against “coercive and blackmailing tactics” ahead of May 9–12 meetings in Switzerland.
The Bitcoin buy follows Metaplanet’s expansion push, including a new Florida-based subsidiary aimed at raising $250M from U.S. capital markets.

Japan’s Metaplanet Inc. added another 555 Bitcoin to its treasury late Tuesday, just as China confirmed it would resume trade talks with the U.S., while laying out firm conditions for engagement and warning against “coercive and blackmailing tactics.”

The Tokyo-listed firm disclosed it spent $49.6 million (¥7.67 billion) on the latest acquisition, bringing its total Bitcoin holdings to 5,555 BTC, worth over $482 million at current prices.  

The purchase is part of Metaplanet’s broader strategy to convert capital into Bitcoin amid growing global macro uncertainty.

Metaplanet’s latest buy arrived just as Beijing and Washington signaled a willingness to resume economic dialogue, with China confirming Vice Premier He Lifeng would meet U.S. Treasury Secretary Scott Bessent.

He Lifeng is expected to meet Bessent in Switzerland from May 9 to 12, which would mark the first formal economic talks between the two powers since President Donald Trump reimposed tariffs on a wide range of Chinese imports, some exceeding 140%.

The meeting follows what China’s Ministry of Commerce described as “repeated” efforts by senior U.S. officials to re-engage. 

But China made clear that any negotiation must be based on mutual respect. 

“If the U.S. says one thing but does another, or even attempts to use negotiations as a pretext to continue coercive and blackmailing tactics, China will never agree, nor will it sacrifice its principles or international fairness and justice to seek any agreement,” the Ministry of Commerce said in a statement published by state media outlet CGTN.

That warning followed a tweet from China’s embassy in the U.S., which said Beijing had agreed to re-engage with Washington “based on full consideration of global expectations, China’s interests, and the appeals of the U.S. business community and consumers.”

The comment followed weeks of ambiguity in Washington, with President Trump claiming to have spoken “many times” with China’s President Xi Jinping.

“I have a lot of jobs around the White House; running the switchboard isn’t one of them,” Bessent told reporters last week when asked about the alleged calls.

Investors, long nervous about rate policy and geopolitical flashpoints, are watching Bitcoin, gold, and Treasuries for macro cues, Decrypt reported.

Bitcoin has held close to its February highs over the last 24 hours, while gold climbed to $3,357 amid renewed safe-haven demand. 

The world’s largest crypto is currently trading for $96,500, up 2.1% in the last 24 hours, CoinGecko data shows.

Funding the Bitcoin Playbook

Since early 2024, Metaplanet has issued multiple tranches of zero-coupon bonds and stock acquisition rights, raising over ¥35 billion through its partner EVO FUND. 

Those proceeds have been funneled directly into Bitcoin, with each tranche tied closely to a new purchase. 

The firm’s Bitcoin-centric KPI, BTC Yield, has been positive for three consecutive quarters: 309.8% in Q4 2024, 95.6% in Q1 2025, and 21% in Q2 of this year.

Last week, the Tokyo-based firm announced plans to establish a wholly owned U.S. subsidiary in Florida, dubbed Metaplanet Treasury Corp., aiming to raise up to $250 million to accelerate its Bitcoin accumulation strategy. 

Metaplanet CEO Simon Gerovich said the move is an effort to tap American capital markets and expand 24/7 access to global liquidity.

Edited by Sebastian Sinclair


Crypto VC funds struggle to capture money as startup fundraising rebounds in 2025


Crypto venture capital (VC) firms are experiencing operational strain and consolidation, even as project-level fundraising gains momentum. 

In the first quarter, crypto startups raised $5.85 billion, already accounting for nearly 61% of the capital raised throughout 2024, according to DefiLlama. 

Varys Capital head of venture Tom Dunleavy shared that, despite this influx, fewer active funds are deploying capital, and many firms launched during the last market cycle are no longer consistently participating in deals.

He attributed the pullback to dwindling capital reserves and a lack of meaningful returns and described the situation as “massive consolidation coming in crypto VC.”

Dunleavy noted that many funds raised in 2021 and 2022 are “shadow insolvent,” out of capital but still nominally active. He projected that many non-brand-name firms, and even some established names, will be functionally closed by 2026.

Crypto VC funds vs. startups

Galaxy Research data shows that while startup fundraising is recovering, venture capital funds are raising less money to invest in crypto projects. 

Additionally, the number of new crypto VC funds peaked in 2022 at more than 300 but has steadily declined yearly. Only around 50 new funds were launched in 2024, and just a fraction of that number entered the market in the first quarter of 2025. 

The number of repeat investors has also shrunk. DefiLlama data shows that of all active funds in the past 180 days, only 67 made more than one investment, which is less than half.

Dunleavy cited several causes, including the absence of distributions to paid-in capital (DPI), a lack of headline investment wins to renew attention from capital allocators, and slower inflows from ultra- and high-net-worth individuals. 

He added that institutional investors remain hesitant despite recent regulatory progress across jurisdictions.

Contraction in venture capital

The fundraising side does not mirror the contraction seen with venture firms. The increase in the first-quarter fundraising volumes suggests that interest in crypto startups is growing. However, capital flows from a narrower base of repeat participants and larger allocators.

As a result, venture activity is becoming more concentrated. Capital is no longer widely distributed across many generalist funds but is instead focused within a smaller group of active players with sufficient dry powder and differentiated theses. 

Dunleavy believes this new landscape is likely a massive positive development for the industry, as venture capital funds are much more selective about where they deploy capital, resulting in better companies thriving.

The crypto fundraising landscape is entering a bifurcated phase. While startups continue to raise money faster than last year, crypto VC funds struggle to justify their relevance, raise new capital, and remain active in a leaner, more disciplined market.




Join the 2025 Web3j Mentorships: Build the Future of Ethereum on the JVM


The Linux Foundation Decentralized Trust (LFDT) mentorship program is back in 2025 with two exciting opportunities for developers passionate about Ethereum, Android, and the Java Virtual Machine (JVM). Whether you’re a seasoned blockchain developer or an Android enthusiast eager to dive into Web3, these mentorships offer a unique chance to contribute to the Web3j ecosystem.

The Web3j-Android library is pivotal for developers integrating Android applications with the Ethereum blockchain, enabling them to leverage Web3j’s robust capabilities within a mobile environment. During the previous mentorship program in 2024, the Web3j-Android library was significantly upgraded from version 4.8.8 to version 4.12.3-android, resolving challenges related to coherence, functionality, and alignment with modern Android development practices.

This project proposes to build upon the accomplishments of the prior mentorship by further enhancing and expanding the functionalities of the Web3j-Android library. Given Kotlin’s prominence as the preferred programming language for Android development, this initiative aims specifically to facilitate the generation of Kotlin-based smart contract wrappers. Achieving this objective will ensure the library fully aligns with contemporary development standards for both Android applications and Ethereum blockchain technologies.​

Once this is achieved, the project will move on to improving the release process and the procedure for keeping Web3j-Android up to date with the main Web3j library.

📌 Application Link: Apply Here

Recommended Skills:

• Strong understanding of Android development: Knowledge of Android SDK, Android Studio, and the Android app development lifecycle.
• Very good understanding of Java and Kotlin: Since Android development can involve both languages, familiarity with Java and Kotlin is essential, especially considering the existing Web3j-Android library is in Java, and new developments in Android often leverage Kotlin.
• Experience with Ethereum and smart contracts: Understanding of Ethereum blockchain, smart contracts, and how they interact with Android applications using Web3j.
• Familiarity with Web3j library: A grasp of the existing Web3j library functionalities, structures, and its application in Android environments.
• Problem-solving skills: Ability to diagnose and troubleshoot complex issues that may arise during development or integration of blockchain technologies in Android applications.
• Version control with Git: Experience with Git for version control to manage code changes and collaborate with other developers.
• Communication and collaboration skills: Ability to work effectively in a team, articulate ideas clearly, and collaborate on complex projects.

The Web3j library is an essential tool for Java and JVM-based developers looking to integrate with the Ethereum blockchain. It provides seamless Ethereum client communication, enabling developers to interact with smart contracts and blockchain networks. However, maintaining and improving Web3j while ensuring seamless adoption of the latest Ethereum Improvement Proposals (EIPs) remains a challenge for maintainers due to the lack of time and contributors.​

This project aims to enhance the core Web3j by improving and bringing up to date its component libraries such as Web3j-Unit, Web3j-EVM, and Web3j-OpenAPI, focusing on reducing existing issues and integrating the missing EIPs. By improving the additional Web3j libraries, we aim to streamline development workflows, enhance testability, and create a more robust framework that simplifies Ethereum development for Java and JVM-based projects. The updates will address known gaps in the libraries, ensuring compatibility with the latest Ethereum standards while improving documentation, developer experience, and maintainability.​

The initiative will introduce structural improvements, new features, and documentation for better integration and testing, making Web3j a more sustainable and adaptable Web3 development tool.

📌 Application Link: Apply Here

Recommended Skills:

• Strong understanding of Java development: Knowledge of Java versions, IntelliJ IDEA, Gradle, GitHub.
• Very good understanding of Java and Kotlin.
• Background in computer science, computer engineering, or equivalent.
• Experience with Ethereum and smart contracts: Understanding of Ethereum blockchain, smart contracts, and how they interact with applications using Web3j.
• Familiarity with Web3j library: A grasp of the existing Web3j library functionalities and structures.
• Problem-solving skills: Ability to diagnose and troubleshoot complex issues that may arise during development or integration of blockchain technologies in JVM applications.
• Version control with Git: Experience with Git for version control to manage code changes and collaborate with other developers.
• Communication and collaboration skills: Ability to work effectively in a team, articulate ideas clearly, and collaborate on complex projects.

Both mentorships will be guided by two seasoned professionals from Web3 Labs:

• George Ţebrean: A Senior Java Developer and experienced open-source contributor, George brings over 8 years of expertise in Java development and blockchain integration.
• Nischal Sharma: A Blockchain Developer at Web3 Labs and maintainer of Web3j, Nischal has extensive experience in Ethereum development and has previously mentored projects under the Linux Foundation’s Hyperledger program.

Their combined expertise ensures that mentees will receive comprehensive guidance throughout the program.

These mentorships are more than just projects — they’re gateways to becoming integral parts of the Web3j and Ethereum developer communities. You’ll gain hands-on experience, mentorship from industry experts, and the opportunity to make a lasting impact on tools used by developers worldwide.

📌 Application Link: Apply Here




South East Asia Public Cloud Market 2025 Edition: Size, Share, Industry Growth, Trends, Research Report 2033 | Web3Wire


South East Asia Public Cloud Market

South East Asia Public Cloud Market Outlook

Base Year: 2024

Historical Years: 2019-2024

Forecast Years: 2025-2033

Market Size in 2024: USD 32,033.8 Million

Market Forecast in 2033: USD 107,427.1 Million

Market Growth Rate: 14.39% (2025-2033)

The South East Asia public cloud market size reached USD 32,033.8 Million in 2024. Looking forward, IMARC Group expects the market to reach USD 107,427.1 Million by 2033, exhibiting a growth rate (CAGR) of 14.39% during 2025-2033.
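The relationship between the 2024 base, the 2033 forecast, and the stated growth rate can be checked with the standard compound annual growth rate formula. The sketch below assumes nine compounding years (2024 to 2033). The function names are illustrative and are not taken from the report.

```python
def implied_cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, in percent, that takes start to end."""
    return ((end / start) ** (1 / years) - 1) * 100


def project(start: float, cagr_percent: float, years: int) -> float:
    """Value reached after compounding start at cagr_percent for years."""
    return start * (1 + cagr_percent / 100) ** years


size_2024 = 32_033.8       # USD million, base year
forecast_2033 = 107_427.1  # USD million, forecast
years = 2033 - 2024        # assumed number of compounding periods

# Growth rate implied by the two endpoints, to compare with the stated 14.39%.
print(round(implied_cagr(size_2024, forecast_2033, years), 2))

# Forward projection from the base year at the stated rate.
print(round(project(size_2024, 14.39, years), 1))
```

Both directions agree with the reported figures to within rounding, so the three numbers are internally consistent under this assumption.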

South East Asia Public Cloud Market Trends:

The public cloud market in South East Asia continues to grow quickly as industries pursue digital transformation initiatives, adopt cloud-native technologies faster, and invest more in modernizing infrastructure. Enterprises are moving from traditional, monolithic IT systems to cloud platforms that provide scalability, improve operational agility, reduce costs and support remote workforces. The rise of e-commerce, fintech and digital services is increasing demand for cloud systems that are flexible and secure. Additionally, regulatory support and a greater awareness of data sovereignty are pushing providers to localize data centres within South East Asia to meet compliance requirements and reduce latency.

Small and medium-sized enterprises are also adopting cloud services to create competitive advantages, including solutions for actionable analytics, artificial intelligence, continuous cybersecurity breach monitoring, and web applications for financial or retail services. The hyperscalers are also increasing their presence in South East Asia, particularly in markets such as Indonesia, Malaysia, and Vietnam, through strategic partnerships and new availability zones in these markets. Multi-cloud and hybrid cloud strategies are also becoming popular as organizations move away from vendor lock-in, strengthen business continuity, and make the most of available expertise. In summary, public cloud is becoming far more accessible, competitive and customized for industries from manufacturing to healthcare to education. South East Asia’s public cloud market offers growth opportunities to participants of all sizes.

For an in-depth analysis, you can refer to a free sample copy of the report: https://www.imarcgroup.com/south-east-asia-public-cloud-market/requestsample

South East Asia Public Cloud Market Scope and Growth Analysis:

The public cloud market in South East Asia continues to develop rapidly owing to digital transformation across all segments of the market, increased mobile internet penetration, and growing adoption of cloud-native technologies across industries. Countries like Singapore, Indonesia and Malaysia are leading the transition, backed by supportive government policy and increasing investments in cloud infrastructure. Moreover, the growing demand for scalable and cost-efficient IT solutions among both SMEs and large enterprises is increasing the adoption of IaaS, PaaS and SaaS products. Additionally, the increased emphasis on data localization and regulatory compliance, especially in relation to government, has pushed a number of global cloud service providers to roll out more data centers in the region to meet their clients’ legal and regulatory obligations.

Finally, emerging technologies like AI, IoT and machine learning have increased the demand for high-performance computing environments that public cloud platforms can deliver. In summary, the digital economy has spurred rapid growth in digital services and platforms (especially in banking, e-commerce, healthcare and education), creating a fast-growing cloud customer base. In addition, partnerships between local companies and global hyperscalers are creating new routes to market for local companies. The market continues to move at pace, providing a wealth of opportunities for stakeholders.

South East Asia Public Cloud Market Report Segmentation:

The market report offers a comprehensive analysis of the segments, highlighting those with the largest South East Asia Public Cloud Market share. It includes forecasts for the period 2025-2033 and historical data from 2019-2024 for the following segments.

Service Insights:

• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
• Software as a Service (SaaS)

Enterprise Size Insights:

• Large Enterprises
• Small and Medium-sized Enterprises

End Use Insights:

• BFSI
• IT and Telecom
• Retail and Consumer Goods
• Manufacturing
• Energy and Utilities
• Healthcare
• Media and Entertainment
• Government and Public Sector
• Others

Country Insights:

• Indonesia
• Thailand
• Singapore
• Philippines
• Vietnam
• Malaysia
• Others

Competitive Landscape:

The report offers an in-depth examination of the competitive landscape. It includes a thorough competitive analysis encompassing market structure, key player positioning, leading strategies for success, a competitive dashboard, and a company evaluation quadrant.

Ask Analyst for Customization: https://www.imarcgroup.com/request?type=report&id=20953&flag=C

Key highlights of the Report:

• Recent Industry News
• Key Technological Trends & Development
• COVID-19 Impact on the Market
• Porter’s Five Forces Analysis
• Strategic Recommendations
• Market Dynamics
• Historical, Current and Future Market Trends
• Market Drivers and Success Factors
• SWOT Analysis
• Value Chain Analysis
• Comprehensive Mapping of the Competitive Landscape
• Top Winning Strategies

Explore More Research Reports & Get Your Free Sample Now!

South East Asia Luxury Fashion Market: https://www.imarcgroup.com/south-east-asia-luxury-fashion-market/requestsample

South East Asia Tire Market: https://www.imarcgroup.com/south-east-asia-tire-market/requestsample

Note: If you need specific information that is not currently within the scope of the report, we can provide it to you as a part of the customization.

Contact US:

IMARC Group
134 N 4th St. Brooklyn, NY 11249, USA
Email: sales@imarcgroup.com
Tel No: (D) +91 120 433 0800
United States: +1-631-791-1145

About US:

IMARC Group is a global management consulting firm that helps the world’s most ambitious changemakers to create a lasting impact. The company provides a comprehensive suite of market entry and expansion services.

IMARC offerings include thorough market assessment, feasibility studies, company incorporation assistance, factory setup support, regulatory approvals and licensing navigation, branding, marketing and sales strategies, competitive landscape and benchmarking analyses, pricing and cost research, and procurement research.

This release was published on openPR.

About Web3Wire Web3Wire – Information, news, press releases, events and research articles about Web3, Metaverse, Blockchain, Artificial Intelligence, Cryptocurrencies, Decentralized Finance, NFTs and Gaming. Visit Web3Wire for Web3 News and Events, Block3Wire for the latest Blockchain news and Meta3Wire to stay updated with Metaverse News.




How to Create a PDF Chatbot Using RAG, Chunking, and Vector Search



Interacting with documents has evolved dramatically. Tools like Perplexity, ChatGPT, Claude, and NotebookLM have revolutionized how we engage with PDFs and technical content. Instead of tediously scrolling through pages, we can now receive instant summaries, answers, and explanations. But have you ever wondered what happens behind the scenes?

Let me guide you through creating your PDF chatbot using Python, LangChain, FAISS, and a local LLM like Mistral. This isn’t about building a competitor to established solutions – it’s a practical learning journey to understand fundamental concepts like chunking, embeddings, vector search, and Retrieval-Augmented Generation (RAG).

Understanding the Technical Foundation

Before diving into code, let’s understand our technology stack. We’ll use Python with Anaconda for environment management, LangChain as our framework, Ollama running Mistral as our local language model, FAISS as our vector database, and Streamlit for our user interface.

Harrison Chase launched LangChain in 2022. It simplifies application development with language models and provides the tools to process documents, create embeddings, and build conversational chains.

FAISS (Facebook AI Similarity Search) specializes in fast similarity searches across large volumes of text embeddings. We’ll use it to store our PDF text sections and efficiently search for matching passages when users ask questions.

Ollama is a local LLM runtime server that allows us to run models like Mistral directly on our computer without a cloud connection. This gives us independence from API costs and internet requirements.

Streamlit enables us to quickly create a simple web application interface using Python, making our chatbot accessible and user-friendly.

Setting Up the Environment

Let’s start by preparing our environment:

First, ensure Python is installed (at least version 3.7). We’ll use Anaconda to create a dedicated environment with conda create -n pdf-chatbot python=3.10 and activate it with conda activate pdf-chatbot.

Create a project folder with mkdir pdf-chatbot and navigate to it using cd pdf-chatbot.

Create a requirements.txt file in this directory with the following packages:

Install all required packages with pip install -r requirements.txt.
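After installing, it can help to confirm that the modules the code imports are actually available. The sketch below is only a sanity check: the module list is inferred from the imports used later in this guide (it is an assumption, not the article's own package list), and the helper name missing_modules is made up for illustration.

```python
import importlib.util

# Top-level module names inferred from the imports used later in this guide.
# This list is an assumption, not the article's own requirements.txt.
EXPECTED_MODULES = [
    "langchain",
    "langchain_community",
    "pypdf",
    "faiss",
    "sentence_transformers",
    "streamlit",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```

If anything is reported missing, installing the corresponding package into the active conda environment and rerunning the check is usually enough.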

Install Ollama from the official download page, then verify the installation by checking the version with ollama --version.

In a separate terminal, activate your environment and run Ollama with the Mistral model using ollama run mistral.

Building the Chatbot: A Step-by-Step Guide

We aim to create an application that lets users ask questions about a PDF document in natural language and receive accurate answers based on the document’s content rather than general knowledge. We’ll combine a language model with intelligent document search to achieve this.

Structuring the Project

We’ll create three separate files to maintain a clean separation between logic and interface:

chatbot_core.py – Contains the RAG pipeline logic

streamlit_app.py – Provides the web interface

chatbot_terminal.py – Offers a terminal interface for testing

The Core RAG Pipeline

Let’s examine the heart of our chatbot in chatbot_core.py:

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.chat_models import ChatOllama
from langchain.chains import ConversationalRetrievalChain


def build_qa_chain(pdf_path="example.pdf"):
    # Load the PDF as one document per page
    loader = PyPDFLoader(pdf_path)
    documents = loader.load()[1:]  # Skip page 1 (element 0)

    # Split the pages into overlapping chunks
    splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=100)
    docs = splitter.split_documents(documents)

    # Turn each chunk into a vector
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

    # Index the vectors and expose them through a retriever
    db = FAISS.from_documents(docs, embeddings)
    retriever = db.as_retriever()

    # Local Mistral model served by Ollama
    llm = ChatOllama(model="mistral")

    # Combine the model and the retriever into a conversational chain
    qa_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        return_source_documents=True,
    )
    return qa_chain

This function builds a complete RAG pipeline through several crucial steps:

Loading the PDF: We use PyPDFLoader to read the PDF into document objects that LangChain can process. We skip the first page since it contains only an image.

Chunking: We split the document into smaller sections of 500 characters with 100-character overlaps. This chunking is necessary because language models like Mistral can’t process entire documents at once. The overlap preserves context between adjacent chunks.

Creating Embeddings: We convert each text chunk into a mathematical vector representation using HuggingFace’s all-MiniLM-L6-v2 model. These embeddings capture the semantic meaning of the text, allowing us to find similar passages later.

Building the Vector Database: We store our embeddings in a FAISS vector database specializing in similarity searches. FAISS enables us to find text chunks that match a user’s query quickly.

Creating a Retriever: The retriever acts as a bridge between user questions and our vector database. When someone asks a question, the system creates a vector representation of that question and searches the database for the most similar chunks.

Integrating the Language Model: We use the locally running Mistral model through Ollama to generate natural language responses based on the retrieved text chunks.

Building the Conversational Chain: Finally, we create a conversational retrieval chain that combines the language model with the retriever, enabling back-and-forth conversation while maintaining context.
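To make the chunking step more concrete, here is a minimal sketch of splitting text into overlapping windows with the same sizes used above (500 characters, 100 overlap). It is a simplified sliding window written for illustration, not LangChain's actual algorithm: CharacterTextSplitter first splits on a separator and then merges pieces, so its chunk boundaries will differ. The function name chunk_text is hypothetical.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=100):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# 1200 characters of dummy text, just to show the window sizes
sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk starts 400 characters after the previous one, so the last 100 characters of one chunk reappear at the start of the next. That repetition is what keeps a sentence that straddles a boundary readable in at least one chunk.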

This approach represents the essence of RAG: improving model outputs by enhancing the input with relevant information from an external knowledge source (in this case, our PDF).
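The retrieve-then-generate idea can be sketched without any libraries. In the toy example below, word counts stand in for real embeddings and a sorted list stands in for FAISS. Both are deliberate simplifications, and the function names (embed, retrieve, build_prompt) and the prompt wording are assumptions made for illustration, not what LangChain does internally.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real pipeline would use a neural model."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Place the retrieved chunks in front of the question."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "FAISS stores vectors and finds the nearest neighbours quickly.",
    "Streamlit turns Python scripts into web apps.",
    "Ollama runs language models such as Mistral locally.",
]
top = retrieve("Which tool runs Mistral locally?", chunks, k=1)
print(build_prompt("Which tool runs Mistral locally?", top))
```

The language model only ever sees the prompt that build_prompt returns, which is why retrieval quality has such a direct effect on answer quality.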

Creating the User Interface

Next, let’s look at our Streamlit interface in streamlit_app.py:

import streamlit as st
from chatbot_core import build_qa_chain

st.set_page_config(page_title="📄 PDF-Chatbot", layout="wide")
st.title("📄 Chat with your PDF")

qa_chain = build_qa_chain("example.pdf")

# Keep the conversation across Streamlit reruns
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

question = st.text_input("What would you like to know?", key="input")
if question:
    result = qa_chain({
        "question": question,
        "chat_history": st.session_state.chat_history
    })
    st.session_state.chat_history.append((question, result["answer"]))

    # Show the newest exchange first
    for i, (q, a) in enumerate(st.session_state.chat_history[::-1]):
        st.markdown(f"**❓ Question {len(st.session_state.chat_history) - i}:** {q}")
        st.markdown(f"**🤖 Answer:** {a}")

This interface provides a simple way to interact with our chatbot. It sets up a Streamlit page, builds our QA chain using the specified PDF, initializes a chat history, creates an input field for questions, processes those questions through our QA chain, and displays the conversation history.

Terminal Interface for Testing

We also create a terminal interface in chatbot_terminal.py for testing purposes:

from chatbot_core import build_qa_chain

qa_chain = build_qa_chain("example.pdf")
chat_history = []

print("🧠 PDF-Chatbot started! Enter 'exit' to quit.")

while True:
    query = input("\n❓ Your questions: ")
    if query.lower() in ["exit", "quit"]:
        print("👋 Chat finished.")
        break

    result = qa_chain({"question": query, "chat_history": chat_history})
    print("\n💬 Answer:", result["answer"])
    chat_history.append((query, result["answer"]))

    # Show the start of the top-ranked chunk used for the answer
    print("\n🔍 Source – Document snippet:")
    print(result["source_documents"][0].page_content[:300])

This version lets us interact with the chatbot through the terminal, showing answers and the source text chunks used to generate those answers. This transparency is valuable for learning and debugging.
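The terminal script prints only the first retrieved chunk. As a possible extension, the sketch below formats every returned chunk along with its page number. It assumes each source document exposes page_content and a metadata dictionary with a zero-based "page" entry, as PyPDFLoader is understood to provide; the helper name format_sources is hypothetical, and simple stand-in objects are used so the example runs without LangChain.

```python
from types import SimpleNamespace


def format_sources(source_documents, max_chars=300):
    """Return one readable line per retrieved chunk, with its page if known."""
    lines = []
    for rank, doc in enumerate(source_documents, start=1):
        page = doc.metadata.get("page")
        # Page indices are assumed to be zero-based, so add 1 for display.
        label = f"page {page + 1}" if isinstance(page, int) else "page unknown"
        # Collapse line breaks and repeated spaces before truncating.
        snippet = " ".join(doc.page_content.split())[:max_chars]
        lines.append(f"[{rank}] ({label}) {snippet}")
    return lines


# Stand-in objects with the same two attributes as a LangChain Document.
fake_docs = [
    SimpleNamespace(page_content="Chunk one\ntext.", metadata={"page": 1}),
    SimpleNamespace(page_content="Chunk two text.", metadata={}),
]
for line in format_sources(fake_docs):
    print(line)
```

In the terminal loop, the same function could be called with result["source_documents"] in place of fake_docs.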

Running the Application

To launch the Streamlit application, we run streamlit run streamlit_app.py in our terminal. The app opens automatically in a browser, where we can ask questions about our PDF document.

Future Improvements

While our current implementation works, several enhancements could make it more practical and user-friendly:

Performance Optimization: The current setup might take around two minutes to respond. We could improve this with a faster LLM or additional computing resources.

Public Accessibility: Our app runs locally, but we could deploy it on Streamlit Cloud to make it publicly accessible.

Dynamic PDF Upload: Instead of hardcoding a specific PDF, we could add an upload button to process any PDF the user chooses.

Enhanced User Interface: Our simple Streamlit app could benefit from better visual separation between questions and answers and from displaying PDF sources for answers.
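For the dynamic upload idea, one practical detail is that build_qa_chain expects a file path, while an upload widget hands over raw bytes. A minimal sketch of bridging the two is to write the bytes to a temporary file first. The helper name save_pdf_bytes is hypothetical, and the Streamlit wiring appears only as comments because it depends on the rest of the app.

```python
import os
import tempfile


def save_pdf_bytes(data, suffix=".pdf"):
    """Write uploaded bytes to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name


# In a Streamlit app the bytes would come from an upload widget, for example:
#   uploaded = st.file_uploader("Upload a PDF", type="pdf")
#   if uploaded is not None:
#       path = save_pdf_bytes(uploaded.getvalue())
#       qa_chain = build_qa_chain(path)
path = save_pdf_bytes(b"%PDF-1.4 demo bytes")
print(path, os.path.getsize(path))
os.remove(path)  # clean up the demo file
```

Because the file is created with delete=False, the app is responsible for removing it once the index has been built.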

The Power of Understanding

Building this PDF chatbot yourself provides deeper insight into the key technologies powering modern AI applications. You gain practical knowledge of how these systems function by working through each step, from chunking and embeddings to vector databases and conversational chains.

This approach’s power lies in its combination of local LLMs and document-specific knowledge retrieval. By focusing the model only on relevant content from the PDF, we reduce the likelihood of hallucinations while providing accurate, contextual answers.

This project demonstrates how accessible these technologies have become. With open-source tools like Python, LangChain, Ollama, and FAISS, anyone with basic programming knowledge can build a functional RAG system that brings documents to life through conversation.

As you experiment with your implementation, you’ll develop a more intuitive understanding of what makes modern AI document interfaces work, preparing you to build more sophisticated applications in the future. The field is evolving rapidly, but the fundamental concepts you’ve learned here will remain relevant as AI continues transforming how we interact with information.



