Basic Extraction

LangExtract uses an LLM, guided by a prompt and a few worked examples, to extract structured data from unstructured text.

Key Concepts

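  • Extraction: Converting free text into structured objects, each with a class label (such as "city") and the matching span of text.
  • Prompt: The instructions given to the LLM to define what to extract.
  • Examples: Sample text paired with the extractions expected from it, showing the LLM the output format.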
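
To make the idea of a structured object concrete, here is a minimal sketch in plain Python. It does not use LangExtract: `SimpleExtraction` and `locate` are hypothetical names introduced only for illustration, assuming an extraction is a class label plus a span copied verbatim from the source text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SimpleExtraction:
    """Hypothetical stand-in for a structured extraction (not LangExtract's class)."""
    extraction_class: str  # the kind of entity, e.g. "city"
    extraction_text: str   # the span copied verbatim from the source text


def locate(text: str, item: SimpleExtraction) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of the span, or None if it is absent."""
    start = text.find(item.extraction_text)
    if start == -1:
        return None
    return start, start + len(item.extraction_text)


source = "Los Angeles has a population of about 3.8 million, and the mayor is Karen Bass."
items = [
    SimpleExtraction("city", "Los Angeles"),
    SimpleExtraction("population", "about 3.8 million"),
    SimpleExtraction("mayor", "Karen Bass"),
]

# Map each class label to where its span sits in the source text.
spans = {item.extraction_class: locate(source, item) for item in items}
for name, span in spans.items():
    print(name, span)
```

Because each span is copied exactly from the text, it can be traced back to a position in the source. A paraphrased value would return `None` here.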

Code Example

python
import langextract as lx

# 1. Input Text (English)
input_text = """
New York City is the most populous city in the United States, with a population of about 8.3 million.
The mayor is Eric Adams.
"""

# 2. Prompt Description
prompt_description = """
Extract the following information:
- City name
- Population
- Mayor

Use the exact wording from the text; do not paraphrase.
"""

# 3. Define example data
examples = [
    lx.data.ExampleData(
        text="Los Angeles has a population of about 3.8 million, and the mayor is Karen Bass.",
        extractions=[
            lx.data.Extraction(extraction_class="city", extraction_text="Los Angeles"),
            lx.data.Extraction(extraction_class="population", extraction_text="about 3.8 million"),
            lx.data.Extraction(extraction_class="mayor", extraction_text="Karen Bass")
        ]
    )
]

# 4. Run the extraction (Gemini models need an API key,
#    e.g. set in the LANGEXTRACT_API_KEY environment variable)
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt_description,
    examples=examples,
    model_id="gemini-2.5-flash",
)

# 5. Display results
print("Extracted entities:")
for entity in result.extractions:
    print(f"• {entity.extraction_class}: {entity.extraction_text}")

Expected Output

Extracted entities:
• city: New York City
• population: about 8.3 million
• mayor: Eric Adams
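
Once the entities are extracted, it is often convenient to collect them into a single record. The sketch below is one possible approach, not part of LangExtract: `group_by_class` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `extraction_class` and `extraction_text` attributes used above, so the snippet runs without calling a model.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_class(extractions):
    """Group extraction texts by class label, keeping every value per class."""
    grouped = defaultdict(list)
    for item in extractions:
        grouped[item.extraction_class].append(item.extraction_text)
    return dict(grouped)


# Stand-in objects mirroring the expected output above (no model call is made).
sample = [
    SimpleNamespace(extraction_class="city", extraction_text="New York City"),
    SimpleNamespace(extraction_class="population", extraction_text="about 8.3 million"),
    SimpleNamespace(extraction_class="mayor", extraction_text="Eric Adams"),
]

record = group_by_class(sample)
print(record)
```

With a real run, passing `result.extractions` in place of `sample` should work the same way, assuming each item exposes those two attributes as in the display loop above. Values are kept in lists because a text may contain more than one entity of the same class.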

Unofficial Guide. Not associated with Google.