A Complete Guide to Microsoft's Latest Open-Source Model Phi-3.5: From MoE to Vision, One-Stop Mastery of Next-Generation AI for Building Intelligent Applications with Ease

AI超元域
17 min read · Aug 24, 2024


🔥🔥🔥 The companion video for these notes: https://youtu.be/IE6rGV4ek7I

NVIDIA NIM | phi-3_5-moe

microsoft/Phi-3.5-MoE-instruct · Hugging Face

Google Colab

Algorithm tests

Longest Palindromic Substring — LeetCode

Merge k Sorted Lists — LeetCode

Substring with Concatenation of All Words — LeetCode

NIM code

export NVIDIA_API_KEY=nvapi-9kPgR5Yz28

from openai import OpenAI
import os

# Create a client that points at NVIDIA's OpenAI-compatible NIM endpoint
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.getenv("NVIDIA_API_KEY")
)

# Stream a chat completion from the Phi-3.5 MoE instruct model
completion = client.chat.completions.create(
    model="microsoft/phi-3.5-moe-instruct",
    messages=[{"role": "user", "content": "What is the latest version of Python?"}],
    temperature=0.2,
    top_p=0.7,
    max_tokens=1024,
    stream=True
)

# Print tokens as they arrive
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
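
For comparison, a non-streaming variant of the same request returns the full answer in a single response object (a minimal sketch, reusing the client and model from the block above):

# Non-streaming variant: the whole reply arrives in one response object
completion = client.chat.completions.create(
    model="microsoft/phi-3.5-moe-instruct",
    messages=[{"role": "user", "content": "What is the latest version of Python?"}],
    temperature=0.2,
    top_p=0.7,
    max_tokens=1024,
    stream=False
)
print(completion.choices[0].message.content)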

👉👉👉 For questions or project collaboration, contact me on WeChat: stoeng

🔥🔥🔥 The code for this project was produced by the AI超元域 channel. For more videos on fine-tuning large models, visit my channels ⬇

👉👉👉 My Bilibili channel

👉👉👉 My YouTube channel

**👉👉👉 My open-source project: https://github.com/win4r/AISuperDomain**

MoE UI

import os

import chainlit as cl
from openai import OpenAI

# Synchronous OpenAI client pointed at NVIDIA's OpenAI-compatible endpoint
api_key = os.environ.get("NVIDIA_API_KEY")
client = OpenAI(base_url="https://integrate.api.nvidia.com/v1", api_key=api_key)

cl.instrument_openai()

settings = {
    "model": "microsoft/phi-3.5-moe-instruct",
    "temperature": 0.5,
    "top_p": 0.7,
    "max_tokens": 1024,
}


@cl.on_chat_start
def start_chat():
    # Initialize the per-session conversation history
    cl.user_session.set(
        "message_history",
        [{"role": "system", "content": "You are a helpful assistant."}],
    )


async def async_generator(sync_gen):
    # Wrap the synchronous stream so it can be consumed with `async for`
    for item in sync_gen:
        yield item


@cl.on_message
async def run_conversation(message: cl.Message):
    message_history = cl.user_session.get("message_history")
    message_history.append({"role": "user", "content": message.content})

    msg = cl.Message(content="")
    await msg.send()

    stream = client.chat.completions.create(
        messages=message_history, stream=True, **settings
    )

    # Forward tokens to the Chainlit UI as they arrive
    async for part in async_generator(stream):
        if token := part.choices[0].delta.content or "":
            await msg.stream_token(token)

    message_history.append({"role": "assistant", "content": msg.content})
    await msg.update()
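
Assuming the script above is saved as app.py (a hypothetical filename), the Chainlit UI is launched from the command line; the -w flag enables auto-reload during development:

chainlit run app.py -w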

vision

import base64
import os

import requests

invoke_url = "https://integrate.api.nvidia.com/v1/chat/completions"
stream = False  # Set to False to get a single JSON response instead of a stream

# Read the image and encode it as base64
with open("./dog.jpeg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

assert len(image_b64) < 180_000, \
    "To upload larger images, use the assets API (see docs)"

headers = {
    # API key read from the NVIDIA_API_KEY environment variable
    "Authorization": f"Bearer {os.getenv('NVIDIA_API_KEY')}",
    "Accept": "application/json"  # Accept a plain JSON response
}

payload = {
    "model": "microsoft/phi-3.5-vision-instruct",
    "messages": [
        {
            "role": "user",
            # The image is passed inline as a base64 data URI inside an <img> tag
            "content": f'Describe the image. <img src="data:image/jpeg;base64,{image_b64}" />'
        }
    ],
    "max_tokens": 512,
    "temperature": 0.20,
    "top_p": 0.70,
    "stream": stream
}

response = requests.post(invoke_url, headers=headers, json=payload)

# Parse the JSON response and print only the generated description
if response.status_code == 200:
    response_json = response.json()
    description = response_json['choices'][0]['message']['content']
    print(description)
else:
    print(f"Error: {response.status_code}")
    print(response.text)
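
The comment above notes that stream was set to False to get a single JSON response. If token-by-token output is preferred, the sketch below shows one way to consume the endpoint as a stream, reusing the invoke_url, headers, and payload defined above; it assumes the service emits OpenAI-style server-sent events ("data: ..." lines ending with "[DONE]"), which is an assumption rather than something taken from the original post:

import json  # needed to parse each streamed chunk

# Request a streamed response instead of a single JSON body (assumed SSE format)
headers["Accept"] = "text/event-stream"
payload["stream"] = True

with requests.post(invoke_url, headers=headers, json=payload, stream=True) as resp:
    for raw_line in resp.iter_lines():
        if not raw_line:
            continue
        line = raw_line.decode("utf-8")
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # end-of-stream sentinel used by OpenAI-style APIs
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="")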

vision-UI

import base64
import os

import chainlit as cl
import requests


# Function to get an image description from the NVIDIA API
async def get_image_description(image_path, api_key):
    invoke_url = "https://integrate.api.nvidia.com/v1/chat/completions"

    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    assert len(image_b64) < 180_000, \
        "To upload larger images, use the assets API (see docs)"

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json"
    }
    payload = {
        "model": "microsoft/phi-3.5-vision-instruct",
        "messages": [
            {
                "role": "user",
                "content": f'Describe the image. <img src="data:image/jpeg;base64,{image_b64}" />'
            }
        ],
        "max_tokens": 512,
        "temperature": 0.20,
        "top_p": 0.70,
        "stream": False
    }

    response = requests.post(invoke_url, headers=headers, json=payload)
    if response.status_code == 200:
        response_json = response.json()
        description = response_json['choices'][0]['message']['content']
        return description
    else:
        return f"Error: {response.status_code}\n{response.text}"


@cl.on_chat_start
async def start():
    await cl.Message(content="Welcome! Please upload an image to get a description.").send()


@cl.on_message
async def on_message(msg: cl.Message):
    if not msg.elements:
        await cl.Message(content="No file attached. Please upload an image.").send()
        return

    # Process image attachments only
    images = [file for file in msg.elements if "image" in file.mime]
    if not images:
        await cl.Message(content="Please upload an image file.").send()
        return

    # Get the API key from the environment
    api_key = os.getenv('NVIDIA_API_KEY')
    if not api_key:
        await cl.Message(content="Error: NVIDIA API key not found in environment variables.").send()
        return

    # Process the first image
    image_path = images[0].path

    # Get the image description
    description = await get_image_description(image_path, api_key)

    # Send the description back to the user
    await cl.Message(content=f"Image Description: {description}").send()

    # Optional: display the processed image
    with open(image_path, "rb") as f:
        image_content = f.read()
    elements = [
        cl.Image(name="processed_image", content=image_content, mime=images[0].mime)
    ]
    await cl.Message(content="Processed Image:", elements=elements).send()
    await cl.Message(content=f"Processed {len(images)} image(s)").send()

# Launch with: chainlit run <this file>

autogen studio

autogenstudio ui --port 8081
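
If AutoGen Studio is not installed yet, it is distributed as a Python package, so a typical setup (assuming a recent Python environment) is:

pip install autogenstudio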

Skills code

#Name duck_duck_go
#Description Search the web using DuckDuckGo.
from duckduckgo_search import DDGS


def search_duckduckgo(query, region='wt-wt', safesearch='off', max_results=5):
    """Search DuckDuckGo for the given query and return the results."""
    ddg = DDGS()
    results = ddg.text(keywords=query, region=region, safesearch=safesearch, max_results=max_results)

    for result in results:
        print(f"Title: {result['title']}")
        print(f"URL: {result['href']}")
        print(f"Snippet: {result['body']}\n")

    return results


# Test the function
if __name__ == "__main__":
    query = "Web scraping with Python"
    search_duckduckgo(query)

# Example usage for an autogen agent
# Create a new Python script (e.g., execute_search.py) and import the function:
# from skills import search_duckduckgo
# query = "autogenstudio"
# search_duckduckgo(query)
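
This skill depends on the duckduckgo_search library; if it is missing from the environment that AutoGen Studio runs in, it can be installed first:

pip install duckduckgo-search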

autogen prompt

You are a versatile AI assistant specialized in using the DuckDuckGo search API to gather information, refine search results, and transform them into engaging news articles. Your tasks include performing searches, analyzing results, and crafting informative news pieces based on the gathered information.

When given a task or query:

1. Use the search_duckduckgo function to gather initial information. Suggest Python code to execute this function, providing the necessary parameters (query, region, safesearch, max_results).
2. After receiving the search results, analyze the titles, URLs, and snippets. If the information is insufficient or unclear, perform additional searches with refined queries.
3. Synthesize and enhance the search results:
   - Summarize key points from the search results
   - Organize and explain the relevance of the sources
   - Identify information gaps and suggest ways to fill them
4. Transform the synthesized information into a well-structured news article:
   - Create an attention-grabbing headline
   - Write a concise lead paragraph summarizing key information (who, what, when, where, why, and how)
   - Develop the story with supporting details, quotes, and context
   - Include relevant background information
5. Ensure journalistic standards are met:
   - Maintain objectivity and balance
   - Distinguish between facts and opinions
   - Use proper attribution for quotes and claims
   - Cross-reference information from multiple sources when available
6. Enhance readability and engagement:
   - Use clear, concise language for a general audience
   - Break up long paragraphs and use subheadings where appropriate
   - Include relevant statistics or data points
   - Add a brief "Key Takeaways" section for complex topics
7. If additional code execution is needed (e.g., to perform follow-up searches or process results), suggest the appropriate Python code in a code block, using the search_duckduckgo function.
8. Present your final news article in a clear, structured format using Markdown.

Always use Python code blocks for any code that needs to be executed. Include comments in your code to explain what each part does. If you need to save any code or results to a file, include # filename: <filename> as the first line in the code block.

Do not suggest incomplete code or ask the user to modify the code. The user can only execute the code you provide without modifications.

If there are any errors in the code execution, analyze the error message, fix the issue, and provide the corrected full code block.

Continue this process of searching, analyzing, refining, and writing until you have produced a comprehensive and engaging news article based on the original query or task. Verify your final article carefully and include verifiable evidence where possible.

Suggest a filename for the article using the format: "YYYYMMDD_headline_keywords.md"

Reply 'ARTICLE_COMPLETE' when you have finished writing and editing the news article based on the search results.

ollama

ollama run phi3.5:latest
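
Once the model is pulled and running locally, it can also be queried from Python in the same style as the NIM examples above, via Ollama's OpenAI-compatible endpoint (a minimal sketch, assuming a default local Ollama install listening on port 11434; the api_key value is required by the client but ignored by Ollama):

from openai import OpenAI

# Point the OpenAI client at the local Ollama server
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

completion = client.chat.completions.create(
    model="phi3.5:latest",
    messages=[{"role": "user", "content": "What is the latest version of Python?"}],
    temperature=0.2,
)
print(completion.choices[0].message.content)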
