BettaFish is a free open-source multi-agent system that analyzes public opinion across 30+ social media platforms and millions of public comments, and generates comprehensive research reports.
It combines AI crawler clusters, sentiment analysis models, and a collaborative agent network to break through information silos and deliver data-driven insights for decision-making.
Instead of presenting raw metrics, BettaFish deploys specialized AI agents that work together like a research team. Each focuses on different aspects of data collection and analysis before collaborating through a “forum” mechanism to generate nuanced reports.
This approach has resonated with developers and researchers, as evidenced by the project garnering 9,600+ GitHub stars since its release.
Features
Multi-Agent Collaboration System: Four specialized agents (Query, Media, Insight, Report) work in parallel, each equipped with distinct toolsets and analysis capabilities to handle different data sources and content types.
24/7 AI Crawler Clusters: Automated crawlers continuously monitor major social platforms and capture both trending content and millions of user comments in real time.
Multimodal Analysis: Processes text, images, and short video content from social media, plus extracts structured information cards (weather, calendar, stock data) from modern search engines.
Forum Engine Collaboration: Agents engage in debate moderated by an AI host, conducting chain-of-thought discussions to avoid single-model biases and generate higher-quality collective intelligence.
Public-Private Data Integration: Secure interfaces allow seamless connection between external social media data and internal business databases for comprehensive analysis combining external trends with internal insights.
Multiple Sentiment Analysis Models: Integrated collection includes fine-tuned BERT models, multilingual sentiment analyzers, Qwen3 fine-tuning, GPT-2 LoRA models, and traditional machine learning methods.
Custom Report Templates: Built-in template library with support for uploading custom formats, automatically selecting the most appropriate template for each analysis scenario.
Lightweight Python Architecture: Modular design enables one-click deployment and easy integration of custom models and business logic for rapid platform expansion.
How It Works: A Look Inside the Analysis Workflow
BettaFish operates not as a single AI but as a coordinated team of specialized AI agents. The process transforms a simple user question into a comprehensive analytical report through a structured, multi-stage workflow.
1. The Initial Query: The process begins when you type your analysis request, like “What is the public sentiment about Brand X’s new product?”, into the web interface.
2. Simultaneous Investigation: The system immediately deploys three specialized agents that work in parallel:
- Query Agent: Scours the web for news articles, blogs, and general search results to get a broad overview.
- Media Agent: Focuses on multimodal content, analyzing videos and images from platforms like TikTok to understand visual and contextual sentiment.
- Insight Agent: Dives into the database, searching through millions of collected public comments or your own private business data for relevant keywords and discussions.
3. Strategy Formulation: After a quick preliminary search, each agent assesses its initial findings. Based on this overview, it formulates a more detailed, segmented research plan to guide its deep-dive investigation.
4. The “Forum” – Collaborative Deep Dive: This is the core of BettaFish’s unique approach. The agents enter an iterative cycle of research and collaboration managed by the ForumEngine.
- Each agent conducts in-depth research based on its strategy.
- As they uncover information, they share their findings in a virtual “forum.”
- A moderator model (the LLM Host) reads all the communications and generates summaries of the ongoing discussion.
- Crucially, each agent reads the moderator’s summaries and the other agents’ findings. This allows them to refine their own search. For example, if the Query Agent finds a news report about a product recall, the Insight Agent can adjust its database search to look for user comments specifically mentioning that recall. This multi-round cycle of research, sharing, and adjustment ensures the final analysis is cohesive and well-rounded.
5. Synthesizing the Results: Once the collaborative research cycles are complete, the Report Agent steps in. It collects all the final analysis, key findings, and the complete discussion log from the forum.
6. Generating the Report: The Report Agent intelligently selects the best template for the query (e.g., a “Brand Reputation” template versus a “Public Hot Topic” template). It then populates this template with the synthesized information, generating a structured, multi-round HTML report that presents the findings in a clear and organized manner.
Use Cases
Brand Reputation Monitoring: PR teams can track brand mentions across multiple platforms simultaneously, analyzing sentiment shifts in real time to identify potential crises before they escalate. The system drills down into comment sections to capture authentic user opinions that surface-level monitoring might miss.
Market Research and Trend Forecasting: Product managers can analyze consumer reactions to competitor launches, industry trends, and emerging needs by processing millions of social media discussions. The multi-agent approach provides multi-dimensional perspectives that single-model analyses often overlook.
Crisis Management and Response: Organizations facing public scrutiny can use BettaFish to understand the full scope of public sentiment, identify key opinion leaders driving conversations, and track how narratives evolve across different platforms and demographics.
Academic Research on Social Behavior: Researchers studying social phenomena can leverage the crawler system to collect longitudinal data across platforms, applying the built-in sentiment analysis models to examine how public opinion forms and shifts over time.
Financial Market Sentiment Analysis: By modifying API parameters and prompts in the agent toolset, financial analysts can transform BettaFish into a market analysis system that monitors investor sentiment across financial forums, news sites, and social platforms.
How to Use It
1. To get started, make sure your system meets the requirements:
- OS: Windows, Linux, or MacOS.
- Python: Version 3.9 or newer.
- Conda: Anaconda or Miniconda for environment management.
- Database: MySQL (this is optional if you use the cloud service, but new applications are suspended).
2. Clone the project files from GitHub.
git clone https://github.com/666ghj/BettaFish.git
cd BettaFish3. Create an isolated Python environment to avoid conflicts with other projects.
# Create the environment with Python 3.11
conda create -n bettafish_env python=3.11
# Activate the new environment
conda activate bettafish_env5. Install all the required Python libraries listed in the requirements.txt file.
# Note: If you don't plan to use the local sentiment analysis models,
# you can comment out the 'Machine Learning' section in the file first.
pip install -r requirements.txt6. The web crawler component requires browser drivers to function. Install the Chromium driver using Playwright.
playwright install chromium7. Provide your own API keys and database connection details.
- Copy the example configuration file:
cp config.py.example config.py- Open the new
config.pyfile with a text editor and fill in the required information. You will need to add your API key, the base URL for your chosen LLM provider, and the model name for each agent (Insight, Media, etc.). The system works with any provider that uses an OpenAI-compatible API format.
# Connect to your MySQL instance.
DB_HOST = "your_db_host" # For example: "localhost" or "127.0.0.1"
DB_PORT = 3306
DB_USER = "your_db_user"
DB_PASSWORD = "your_db_password"
DB_NAME = "your_db_name"
DB_CHARSET = "utf8mb4"
# Insight Agent
INSIGHT_ENGINE_API_KEY = "your_api_key"
INSIGHT_ENGINE_BASE_URL = "https://api.moonshot.cn/v1"
INSIGHT_ENGINE_MODEL_NAME = "kimi-k2-0711-preview"
# Media Agent
MEDIA_ENGINE_API_KEY = "your_api_key"
MEDIA_ENGINE_BASE_URL = "https://www.chataiapi.com/v1"
MEDIA_ENGINE_MODEL_NAME = "gemini-2.5-pro"
# Query Agent
QUERY_ENGINE_API_KEY = # Your API Key
QUERY_ENGINE_BASE_URL = "https://api.deepseek.com"
QUERY_ENGINE_MODEL_NAME = "deepseek-reasoner"
# Report Agent
REPORT_ENGINE_API_KEY = "your_api_key"
REPORT_ENGINE_BASE_URL = "https://www.chataiapi.com/v1"
REPORT_ENGINE_MODEL_NAME = "gemini-2.5-pro"
# Forum Host
FORUM_HOST_API_KEY = "your_api_key"
FORUM_HOST_BASE_URL = "https://api.siliconflow.cn/v1"
FORUM_HOST_MODEL_NAME = "Qwen/Qwen3-235B-A22B-Instruct-2507"
# SQL keyword Optimizer
KEYWORD_OPTIMIZER_API_KEY = "your_api_key"
KEYWORD_OPTIMIZER_BASE_URL = "https://api.siliconflow.cn/v1"
KEYWORD_OPTIMIZER_MODEL_NAME = "Qwen/Qwen3-30B-A3B-Instruct-2507"
# Tavily API
TAVILY_API_KEY = "your_api_key"
# Bocha API
BOCHA_WEB_SEARCH_API_KEY = "your_api_key"8. Initialize the Database:
cd MindSpider
python schema/init_database.py9. Start the main application.
Return to the project’s root directory and ensure your Conda environment is active.
# Make sure you are in the 'BettaFish' root directory
python app.pyOpen your web browser and navigate to http://localhost:5000 to access the web UI.
Keep in mind that data scraping is a separate process. You will need to run the
MindSpidercrawler to populate your database with social media data before you can analyze it with the InsightEngine. You can find detailed instructions for the crawler in its dedicatedREADME.mdfile.
Pros
- Deep, Comprehensive Analysis: The multi-agent, multi-platform approach delivers a level of insight that is difficult to achieve with a single tool.
- Highly Customizable: As an open-source Python project, nearly every component can be modified, from the LLM provider to the sentiment analysis models.
- No Framework Lock-in: By being built from scratch, the tool avoids the limitations and dependencies of external frameworks.
- Innovative Collaboration Mechanism: The “Forum” concept is a clever solution to mitigate the risk of AI groupthink and generate more robust conclusions.
- Free and Open-Source: The software is free to use, modify, and inspect, with no licensing fees.
Cons
- Complex Setup: This is not a simple tool. It requires a solid understanding of Python, Conda, and configuring API keys and databases.
- Requires Your Own API Keys: You must supply your own LLM API keys, which means you will incur costs based on your usage.
- Data Scraping Responsibility: The user is solely responsible for using the web scraping features ethically and in compliance with the terms of service of the target websites.
FAQs
Q: How much technical expertise is needed to set up BettaFish?
A: You’ll need intermediate Python skills, experience with Conda environments, and basic database management knowledge. The setup process isn’t beginner-friendly but the documentation provides clear steps for developers.
Q: What’s the cost of running BettaFish?
A: The software itself is free and open source, but you’ll need to budget for LLM API costs, computational resources for processing, and potentially the cloud database service if you don’t maintain your own infrastructure.
Q: How current is the data analysis?
A: The system can operate in real-time with continuous monitoring, though the actual freshness depends on your crawling configuration. In my testing, data from the past 1-2 hours appears in analyses when running continuous collection.










