🚀 Replicate Deep Research at Zero Cost! Go Beyond OpenAI Deep Research + DeepSeek R1! Deploy node-DeepResearch, the Most Powerful AI Agent, Built by Jina AI, in Three Minutes!

14 min read · Feb 6, 2025





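🚀 Introduction to OpenAI Deep Research

OpenAI recently launched Deep Research, a feature that carries out multi-step research tasks on the internet automatically and produces a comprehensive report.

Deep Research scored 26.6% on the Humanity's Last Exam benchmark, which shows its ability to handle complex research tasks.

At the time of writing, Deep Research is built into the ChatGPT interface and is available only to Pro subscribers in the United States.

Since access is that limited, we use an open-source alternative, Jina AI's node-DeepResearch, to reproduce Deep Research.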

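🚀 Introduction to Jina AI node-DeepResearch

node-DeepResearch, developed by Jina AI, is an open-source automated research tool that keeps searching, reading web pages, and reasoning until it finds the answer to a question. It combines a search engine with a large language model (LLM) to make research more efficient, and it suits automated information retrieval and question-answering tasks.

  • Automatic search and reasoning: finds relevant web pages with a search engine, reasons over them with a large model (such as Gemini), and produces the final answer.
  • Smart web page reading: uses Jina Reader to extract the key content from web pages, which makes the retrieved information more precise.
  • Configurable APIs: users can configure their own Gemini API and Jina Reader API keys to improve query results.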
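The search, read, and reason cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's actual implementation: the `research` function, its parameters, and the three stub tools are hypothetical names invented for this example.

```python
from typing import Callable, List, Optional, Set


def research(
    question: str,
    search: Callable[[str], List[str]],
    read: Callable[[str], str],
    reason: Callable[[str, List[str]], Optional[str]],
    max_steps: int = 5,
) -> Optional[str]:
    """Repeat search -> read -> reason until reason() returns an answer."""
    notes: List[str] = []
    seen: Set[str] = set()
    for _ in range(max_steps):
        for url in search(question):
            if url in seen:  # do not read the same page twice
                continue
            seen.add(url)
            notes.append(read(url))
        answer = reason(question, notes)
        if answer is not None:
            return answer
    return None  # step budget exhausted without an answer


# Stub tools (hypothetical) standing in for a search engine, a page reader, and an LLM.
def fake_search(question: str) -> List[str]:
    return ["https://example.com/paris"]


def fake_read(url: str) -> str:
    return "Paris is the capital of France."


def fake_reason(question: str, notes: List[str]) -> Optional[str]:
    return "Paris" if any("Paris" in note for note in notes) else None


print(research("what is the capital of France?", fake_search, fake_read, fake_reason))
# prints: Paris
```

In the real tool the three callables would be backed by a search engine, Jina Reader, and Gemini respectively; passing them in as functions simply keeps the sketch runnable offline.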

✅ Jina AI node-DeepResearch GitHub repository: https://github.com/jina-ai/node-DeepResearch



Install Node.js first. Windows users only need to download the installer for their system and run it: https://nodejs.org/en/download

Linux (via nvm):

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.2/install.sh | bash
source ~/.bashrc
nvm install --lts

macOS (via Homebrew):

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install node

Open a terminal and run node -v and npm -v to verify the installation.


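Git: download the installer for your platform from https://git-scm.com/downloads and install it. Verify with: git --version


Python (used for the Gradio chatbot below): download the installer for your platform from https://www.python.org/downloads/ and install it. Verify with: python3 --version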
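If you would rather check all the prerequisites in one go, a small Python script can look each tool up on your PATH. This is a convenience sketch, not part of node-DeepResearch; the helper name `find_tools` is made up for this example, and on Windows the interpreter is often called `python` rather than `python3`.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The tools this tutorial relies on.
    for tool, path in find_tools(["node", "npm", "git", "python3"]).items():
        status = f"OK       {path}" if path else "MISSING"
        print(f"{tool:8} {status}")
```

Any line marked MISSING means the corresponding install step above still needs to be done (or the terminal needs to be reopened so PATH is refreshed).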

🚀 Get your API keys

✅Jina API: https://jina.ai/api-dashboard/key-manager

✅Gemini API: https://aistudio.google.com/prompts/new_chat


# Windows (cmd.exe)
set GEMINI_API_KEY=xxxxxx
set JINA_API_KEY=xxxxxx
# Linux / macOS
export GEMINI_API_KEY=xxxxxx
export JINA_API_KEY=xxxxxx

git clone https://github.com/jina-ai/node-DeepResearch.git
cd node-DeepResearch
npm install
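Before running any query, it can save some head-scratching to confirm that both keys are actually visible in the current shell. The check below is a minimal sketch; `missing_keys` is a hypothetical helper name, and the only things taken from the setup above are the two variable names.

```python
import os
from typing import List, Mapping

# Environment variable names used in the setup step above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "JINA_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Both API keys are set.")
```

Run it in the same terminal session where you set the variables, since `set` and `export` only affect the current session.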


npm run dev "1+1="
npm run dev "what is the capital of France?"
npm run dev "9.9 vs 9.11"
npm run dev "How many R letters are in the word strawberry?"
npm run dev "The hyperparameter settings for fine-tuning Llama3?"
npm run dev "SpaceX的创始人是谁"
npm run dev "1+2+3+4+5+6+...+100="

# Start the server
nohup npm run serve > output.log 2>&1 &

# Call it through the server API
curl -X POST http://localhost:3000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"q": "9.9 vs 9.11",
"budget": 1000000,
"maxBadAttempt": 3
}' | jq -r .requestId | xargs -I {} curl -N http://localhost:3000/api/v1/stream/{}
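The curl pipeline above is really a two-step protocol: POST the question to get a requestId, then open the stream for that id. Here is roughly the same thing in Python using only the standard library. Treat it as a sketch under the assumptions visible in the curl command (host, port, paths, and field names); the function names are invented for this example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000/api/v1"  # same host and port as the curl example


def build_query_request(
    question: str, budget: int = 1000000, max_bad_attempt: int = 3
) -> urllib.request.Request:
    """Step 1: build the POST request that starts a query."""
    body = json.dumps(
        {"q": question, "budget": budget, "maxBadAttempt": max_bad_attempt}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_url(request_id: str) -> str:
    """Step 2: URL of the event stream for a given requestId."""
    return f"{BASE_URL}/stream/{request_id}"


def run_query(question: str) -> None:
    """Send the question and print each streamed line (needs `npm run serve` running)."""
    with urllib.request.urlopen(build_query_request(question), timeout=60) as resp:
        request_id = json.load(resp)["requestId"]
    with urllib.request.urlopen(stream_url(request_id), timeout=120) as stream:
        for raw in stream:
            line = raw.decode("utf-8").rstrip()
            if line:
                print(line)


# Usage, once the server is up:
# run_query("9.9 vs 9.11")
```

Splitting request building from sending keeps the first two functions easy to inspect without a running server.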

🚀 Chatbot code

# Start the server
nohup npm run serve > output.log 2>&1 &
# Install the required dependencies
pip install gradio requests

# Full code:
import gradio as gr
import requests
import json
from typing import Tuple, List, Dict

def parse_sse_data(data: str) -> dict:
    """Parse one SSE line into a dictionary; return {} if it is not a JSON data line."""
    try:
        if data.startswith("data: "):
            return json.loads(data[6:])  # Remove "data: " prefix
    except json.JSONDecodeError:
        return {}
    return {}

def get_final_answer(response_text: str) -> str:
    """Extract the final answer from the SSE stream."""
    lines = response_text.strip().split('\n')
    for line in reversed(lines):  # Search from the end
        parsed = parse_sse_data(line)
        if parsed.get("type") == "answer":
            return parsed.get("data", {}).get("answer", "No answer found")
    return "No answer found"

def query_api(message: str) -> str:
    """Send query to API and get response."""
    try:
        # First request to get requestId
        session = requests.Session()

        init_response = session.post(
            "http://localhost:3000/api/v1/query",  # endpoint from the curl example
            json={
                "q": message,
                "budget": 1000000,
                "maxBadAttempt": 3,
            },
            headers={"Content-Type": "application/json"},
            timeout=60,  # Increased timeout
        )
        init_response.raise_for_status()

        request_id = init_response.json().get("requestId")

        if not request_id:
            return "Error: No request ID received"

        # Stream the response with increased timeout
        stream_response = session.get(
            f"http://localhost:3000/api/v1/stream/{request_id}",
            stream=True,  # read events as they arrive
            timeout=120,  # Increased timeout for streaming
            headers={"Accept": "text/event-stream"},
        )
        stream_response.raise_for_status()
        stream_response.encoding = "utf-8"  # SSE is UTF-8; ensures iter_lines yields str

        full_response = ""
        for line in stream_response.iter_lines(decode_unicode=True):
            if line:
                full_response += line + '\n'
                # Check if we've received the answer
                if '"type":"answer"' in line:
                    parsed = parse_sse_data(line)
                    if parsed.get("type") == "answer":
                        answer = parsed.get("data", {}).get("answer")
                        if answer:
                            return answer

        # If we haven't returned by now, try to extract answer from full response
        answer = get_final_answer(full_response)
        return answer if answer else "No answer found in response"

    except requests.exceptions.Timeout:
        return "Error: Request timed out. Please try again."
    except requests.exceptions.RequestException as e:
        return f"API Error: {str(e)}"
    except Exception as e:
        return f"Error: {str(e)}"

def format_message(role: str, content: str) -> Dict[str, str]:
    """Format message in the OpenAI-style format."""
    return {"role": role, "content": content}

def chat_response(message: str, history: List[Dict[str, str]]) -> str:
    """Handle chat interaction and return response."""
    try:
        response = query_api(message)
        return response
    except Exception as e:
        return f"Error: {str(e)}"

# Create Gradio interface
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("""# AI Query Interface
Enter your question below to get an answer from the AI system.""")

    chatbot = gr.Chatbot(
        label="Chat History",
        type="messages",  # history is a list of {"role", "content"} dicts
    )

    msg = gr.Textbox(
        label="Your Question",
        placeholder="Type your question here...",
    )

    with gr.Row():
        clear = gr.Button("Clear Chat")
        submit = gr.Button("Submit", variant="primary")

    def user(user_message: str, history: List[Dict[str, str]]) -> Tuple[str, List[Dict[str, str]]]:
        history = history or []
        if not user_message.strip():
            return "", history
        user_msg = format_message("user", user_message)
        return "", history + [user_msg]  # clear the textbox, append the question

    def bot(history: List[Dict[str, str]]) -> List[Dict[str, str]]:
        # Nothing to answer if the last entry is not a user message (e.g. empty submit)
        if not history or history[-1]["role"] != "user":
            return history
        user_message = history[-1]["content"]
        bot_response = chat_response(user_message, history)
        bot_msg = format_message("assistant", bot_response)
        return history + [bot_msg]

    # Set up event handlers
    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        bot, chatbot, chatbot
    )
    submit.click(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        bot, chatbot, chatbot
    )
    clear.click(lambda: None, None, chatbot, queue=False)

if __name__ == "__main__":
    # Launch the interface with public URL enabled
    demo.launch(share=True)


