Basic Extraction
LangExtract extracts structured data from unstructured text, guided by a prompt and a small set of worked examples.
Key Concepts
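- Extraction: Converting free text into structured objects, each pairing a class (such as "city") with the text it was taken from.
- Prompt: The instructions given to the LLM to define what to extract.
- Examples: Sample text with its expected extractions, passed alongside the prompt to show the LLM the desired output.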
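To make the "structured objects" idea concrete, here is a minimal sketch of what a single extraction looks like as data. `SimpleExtraction` is a hypothetical stand-in written for illustration, not LangExtract's own class; it only mirrors the two fields used later on this page (`extraction_class` and `extraction_text`), and the extraction is written by hand rather than produced by a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleExtraction:
    """Hypothetical stand-in for illustration; not LangExtract's own class."""
    extraction_class: str  # what kind of thing was found, e.g. "mayor"
    extraction_text: str   # the exact span copied from the source text


# Free text in, structured objects out (filled in by hand here).
free_text = "The mayor is Eric Adams."
structured = [
    SimpleExtraction(extraction_class="mayor", extraction_text="Eric Adams"),
]

for item in structured:
    print(f"{item.extraction_class}: {item.extraction_text}")
```

In the real library, the LLM fills in these objects based on the prompt and examples.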
Code Example
python
import langextract as lx
# 1. Input Text (English)
input_text = """
New York City is the most populous city in the United States, with a population of about 8.3 million.
The mayor is Eric Adams.
"""
# 2. Prompt Description
prompt_description = """
Extract the following information:
- City name
- Population
- Mayor
Extract exactly from the text.
"""
# 3. Define example data
examples = [
    lx.data.ExampleData(
        text="Los Angeles has a population of about 3.8 million, and the mayor is Karen Bass.",
        extractions=[
            lx.data.Extraction(extraction_class="city", extraction_text="Los Angeles"),
            lx.data.Extraction(extraction_class="population", extraction_text="about 3.8 million"),
            lx.data.Extraction(extraction_class="mayor", extraction_text="Karen Bass"),
        ],
    )
]
# 4. Run the extraction
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt_description,
    examples=examples,
    model_id="gemini-2.5-flash",
)
# 5. Display results
print("Extracted entities:")
for entity in result.extractions:
    print(f"• {entity.extraction_class}: {entity.extraction_text}")
Expected Output
Extracted entities:
• city: New York City
• population: about 8.3 million
• mayor: Eric Adams
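Once the extractions are available, a common next step is to collect them into a plain record and to confirm that each value really was copied verbatim, as the prompt requests. The sketch below is one possible way to do that, not part of LangExtract: `to_record` and `not_verbatim` are hypothetical helpers, and the `SimpleNamespace` objects are stand-ins for `result.extractions` so the snippet runs without a model call. It assumes only that each item exposes `extraction_class` and `extraction_text`, as in the loop above.

```python
from types import SimpleNamespace


def to_record(extractions):
    """Group extraction texts by class; a class may occur more than once."""
    record = {}
    for item in extractions:
        record.setdefault(item.extraction_class, []).append(item.extraction_text)
    return record


def not_verbatim(extractions, source_text):
    """Return extraction texts that do not appear word for word in the source."""
    return [
        item.extraction_text
        for item in extractions
        if item.extraction_text not in source_text
    ]


source_text = (
    "New York City is the most populous city in the United States, "
    "with a population of about 8.3 million.\n"
    "The mayor is Eric Adams."
)

# Stand-ins for result.extractions (assumption: same two attributes).
stand_in_extractions = [
    SimpleNamespace(extraction_class="city", extraction_text="New York City"),
    SimpleNamespace(extraction_class="population", extraction_text="about 8.3 million"),
    SimpleNamespace(extraction_class="mayor", extraction_text="Eric Adams"),
]

record = to_record(stand_in_extractions)
paraphrased = not_verbatim(stand_in_extractions, source_text)

print(record)
print("Not found verbatim:", paraphrased)
```

With real output, passing `result.extractions` and `input_text` to the same helpers should work as long as those two attributes are present. A non-empty `paraphrased` list suggests the model reworded a value instead of copying it.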