Soft Bliss Academy

Turning PDFs into Structured Intelligence with Generative AI: My Kaggle Capstone Experience | by amir tabatabaei | Apr, 2025

Posted by softbliss on April 9, 2025, in Machine Learning

amir tabatabaei

PDFs are everywhere in business: invoices, tax forms, legal contracts. But when it comes to automating or analyzing them, they become a nightmare. They’re unstructured, unpredictable, and not built for machines.

For my Kaggle Generative AI Capstone Project, I set out to change that. I built a smart document assistant powered by Google’s Gemini, vector search, and retrieval-augmented generation (RAG). It classifies documents, extracts structured JSON from them, and even answers natural language questions like “Who paid the most tax?” or “List property contracts over $500K.”

To simulate a real-world document processing scenario, the dataset contains 15 synthetically generated documents of three types:

• 🧾 Invoices

• 📄 Tax returns

• 🏡 Property sale contracts

These were handcrafted using Jinja templates and exported as PDFs with randomized values to preserve realism while avoiding private data.
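
As a rough illustration of that generation step, the sketch below fills a small invoice template with randomized values. It uses Python's built-in `string.Template` as a dependency-free stand-in for Jinja, and the template text, field names, price ranges, and the flat 10% tax rate are all assumptions made for this example, not the project's actual templates. PDF export is left out.

```python
import random
from string import Template

# Hypothetical plain-text template; the project itself used Jinja templates.
INVOICE_TEMPLATE = Template(
    "INVOICE\n"
    "Invoice Number: $invoice_number\n"
    "Billed To: $client_name\n"
    "$item_lines\n"
    "Subtotal: $subtotal\n"
    "Tax: $tax\n"
    "Total: $total\n"
)


def make_invoice(rng: random.Random, number: int) -> dict:
    """Build one synthetic invoice record with randomized line items."""
    items = []
    for name in ("Consulting Services", "Support Hours"):
        qty = rng.randint(1, 5)
        unit_price = rng.randint(100, 250)
        items.append(
            {"name": name, "qty": qty, "unit_price": unit_price,
             "total": qty * unit_price}
        )
    subtotal = sum(item["total"] for item in items)
    tax = round(subtotal * 0.10, 2)  # assumed flat 10% tax rate
    return {
        "invoice_number": f"INV-2024-{number}",
        "client_name": rng.choice(["Daniel Lee", "Alex Miller"]),
        "items": items,
        "subtotal": subtotal,
        "tax": tax,
        "total": round(subtotal + tax, 2),
    }


def render_invoice(invoice: dict) -> str:
    """Render the invoice record as the text that would go into a PDF."""
    item_lines = "\n".join(
        f"{i['name']}: {i['qty']} x {i['unit_price']} = {i['total']}"
        for i in invoice["items"]
    )
    return INVOICE_TEMPLATE.substitute(item_lines=item_lines, **invoice)


# A seeded generator keeps the synthetic dataset reproducible.
print(render_invoice(make_invoice(random.Random(0), 100)))
```

Keeping the record (`make_invoice`) separate from the rendering (`render_invoice`) means the same dictionary can later serve as ground truth when checking extracted JSON.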

Here’s the breakdown:

| Document Type       | Count | Description                                  |
|---------------------|-------|----------------------------------------------|
| Invoices | 5 | Billing statements with totals and client info |
| Tax Returns | 5 | U.S. 1040-style income and deductions |
| Property Contracts | 5 | Buyer/seller agreements and sale prices |

The structured JSON for one of the invoices looks like this:

    {
      "invoice_number": "INV-2024-100",
      "client_name": "Daniel Lee",
      "items": [
        {"name": "Consulting Services", "qty": 3, "unit_price": 185, "total": 555},
        {"name": "Support Hours", "qty": 2, "unit_price": 183, "total": 366}
      ],
      "subtotal": 921,
      "tax": 92.1,
      "total": 1013.1
    }
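
The sample above is internally consistent: each line total is quantity times unit price, the subtotal is the sum of the line totals, and the total is subtotal plus tax. A small check like the one below could catch extraction errors in those fields. This is only a sketch: the function name `check_invoice` is hypothetical, and the 10% tax rate is inferred from this one sample (92.1 / 921), not stated in the project.

```python
import math


def check_invoice(invoice: dict, tax_rate: float = 0.10) -> list[str]:
    """Return a list of arithmetic inconsistencies found in an invoice record."""
    problems = []
    for item in invoice["items"]:
        expected = item["qty"] * item["unit_price"]
        if not math.isclose(expected, item["total"], abs_tol=0.005):
            problems.append(f"line total mismatch: {item['name']}")
    line_sum = sum(item["total"] for item in invoice["items"])
    if not math.isclose(line_sum, invoice["subtotal"], abs_tol=0.005):
        problems.append("subtotal does not equal the sum of line totals")
    # tax_rate is an assumption; pass a different value if the rate is known.
    if not math.isclose(invoice["subtotal"] * tax_rate, invoice["tax"], abs_tol=0.005):
        problems.append("tax does not match the assumed rate")
    if not math.isclose(
        invoice["subtotal"] + invoice["tax"], invoice["total"], abs_tol=0.005
    ):
        problems.append("total does not equal subtotal plus tax")
    return problems


sample = {
    "invoice_number": "INV-2024-100",
    "client_name": "Daniel Lee",
    "items": [
        {"name": "Consulting Services", "qty": 3, "unit_price": 185, "total": 555},
        {"name": "Support Hours", "qty": 2, "unit_price": 183, "total": 366},
    ],
    "subtotal": 921,
    "tax": 92.1,
    "total": 1013.1,
}

print(check_invoice(sample))  # prints []
```

The comparisons use a half-cent tolerance rather than `==` because values such as 92.1 are not exactly representable as floats.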

Before structured data can be extracted, the system must first determine the type of document being processed.

This is achieved using few-shot prompting: the model is given a few labeled examples (such as an invoice, a tax return, and a property contract) and is then asked to classify new, unseen documents based on their content.

This approach enables classification without any model fine-tuning, relying solely on the model’s general language understanding and a few representative examples.

Below is an example of the prompt used for classification:

Classify the type of the following document as one of the following: 
invoice, tax_return, property_contract.

Document:
INVOICE
Invoice Number: INV-2024-100
Date: 2025-04-05
Due Date: 2025-05-05
Billed To: Daniel Lee...

Type: invoice

Document:
U.S. Individual Income Tax Return
Taxpayer: Alex Miller
Wages: $40,000...

Type: tax_return

Document:
[Insert new document here...]

Type:
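
A prompt of this shape can be assembled from a list of labeled examples instead of being written by hand. The sketch below is one way to do that in plain Python. The function name `build_few_shot_prompt` and the `LABELS` constant are hypothetical, and the call that sends the prompt to Gemini is deliberately left out.

```python
LABELS = ("invoice", "tax_return", "property_contract")


def build_few_shot_prompt(examples: list[tuple[str, str]], new_document: str) -> str:
    """Assemble a few-shot classification prompt from (text, label) pairs."""
    parts = [
        "Classify the type of the following document as one of the following:\n"
        + ", ".join(LABELS)
        + "."
    ]
    for text, label in examples:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        parts.append(f"Document:\n{text}\n\nType: {label}")
    # Leave the final label blank so the model completes it.
    parts.append(f"Document:\n{new_document}\n\nType:")
    return "\n\n".join(parts)


examples = [
    ("INVOICE\nInvoice Number: INV-2024-100\nBilled To: Daniel Lee...", "invoice"),
    ("U.S. Individual Income Tax Return\nTaxpayer: Alex Miller\nWages: $40,000...",
     "tax_return"),
]
prompt = build_few_shot_prompt(examples, "[Insert new document here...]")
print(prompt)
```

Validating each example label against `LABELS` keeps a typo in the examples from teaching the model a category that the instruction line does not list.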

In response to the prompt, Gemini correctly identified the document as an invoice. The prediction is consistent with cues such as “Invoice Number”, “Billed To”, and monetary line items, all of which are typical of a billing document.

Predicted Type:

“invoice”
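
Model output can arrive wrapped in quotes or with stray whitespace, as in the quoted string above, so it helps to normalize the response before routing the document to a type-specific extractor. The helper below is a minimal sketch of that step. The name `parse_label` and the normalization rules are assumptions for illustration, not part of the original project.

```python
LABELS = ("invoice", "tax_return", "property_contract")


def parse_label(raw: str) -> str:
    """Normalize a raw model response to one of the known document labels."""
    # Drop surrounding whitespace, then quotes, backticks, and a trailing period.
    cleaned = raw.strip().strip("\"'“”‘’`.").strip()
    cleaned = cleaned.lower().replace(" ", "_").replace("-", "_")
    if cleaned not in LABELS:
        raise ValueError(f"unexpected label from model: {raw!r}")
    return cleaned


print(parse_label("“invoice”"))  # prints invoice
```

Raising on anything outside `LABELS` surfaces unexpected responses early instead of letting them pass silently into the extraction step.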

Because it relies on a document's content rather than a fixed layout, this classification method can generalize across varied layouts and content styles, and it scales to large volumes of business documents without any labeled training data beyond the few examples in the prompt.
