Doubt
How can we make it run with AZURE_OPENAI_API_KEY instead of OPENAI_API_KEY?
Since crawl4ai uses LiteLLM in the background, you can use the environment variables that LiteLLM accepts.
For env variables:
os.environ["AZURE_API_KEY"] = "" # "my-azure-api-key"
os.environ["AZURE_API_BASE"] = "" # "https://example-endpoint.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "" # "2023-05-15"
For the model, use your Azure deployment name prefixed with azure/ (in crawl4ai's LLMExtractionStrategy this goes in the provider argument):
provider = "azure/<your_deployment_name>"
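As a quick sanity check, you can call LiteLLM directly once those variables are set, before wiring anything into crawl4ai. A minimal sketch, where gpt-4o-mini stands in for whatever deployment name you created:

import litellm

# Assumes AZURE_API_KEY, AZURE_API_BASE and AZURE_API_VERSION are already set
response = litellm.completion(
    model="azure/gpt-4o-mini",  # azure/<your_deployment_name>
    messages=[{"role": "user", "content": "Say hello"}],
)
print(response.choices[0].message.content)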
Let's say you have already created a deployment named gpt-4o-mini. In this example we define a simple knowledge-graph schema and extract a knowledge graph from one of Paul Graham's essays. Assuming your Azure API key, base, and API version are already set, there's nothing much left to do; you can simply use the following code.
os.environ["AZURE_API_KEY"] = "" # "my-azure-api-key"
os.environ["AZURE_API_BASE"] = "" # "https://example-endpoint.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "" # "2023-05-15"
async def main():
class Entity(BaseModel):
name: str
description: str
class Relationship(BaseModel):
entity1: Entity
entity2: Entity
description: str
relation_type: str
class KnowledgeGraph(BaseModel):
entities: List[Entity]
relationships: List[Relationship]
extraction_strategy = LLMExtractionStrategy(
provider = "azure/gpt-4o-mini",
api_base=os.environ["AZURE_API_BASE"],
api_token=os.environ["AZURE_API_KEY"],
schema=KnowledgeGraph.model_json_schema(),
extraction_type="schema",
instruction="""Extract entities and relationships from the given text."""
)
async with AsyncWebCrawler() as crawler:
url = "https://paulgraham.com/love.html"
result = await crawler.arun(
url=url,
bypass_cache=True,
extraction_strategy=extraction_strategy,
)
# print(result.extracted_content)
with open(os.path.join(__data__, "kb_test.json"), "w") as f:
f.write(result.extracted_content)
print("Done")
Thanks, it worked :)