Real Estate Domain Translator (Phi-3)

This is a fine-tuned Phi-3-mini model specialized for translating natural language queries into domain-aware Snowflake SQL for real estate investment analysis.

Model Details

  • Base Model: microsoft/Phi-3-mini-4k-instruct
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Domain: Real Estate Investment Analysis
  • Training Data: 154 examples extracted from real estate query patterns
  • Training Loss Improvement: 84% (1.62 → 0.26)
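The 84% figure appears to be the relative drop in training loss between the two values quoted in the list. A minimal check of that arithmetic, assuming 1.62 is the initial loss and 0.26 the final loss (the variable names are illustrative):

```python
# Relative reduction in training loss, using the two values quoted in the card.
initial_loss = 1.62  # assumed to be the loss at the start of training
final_loss = 0.26    # assumed to be the loss at the end of training

improvement = (initial_loss - final_loss) / initial_loss
print(f"Relative loss reduction: {improvement:.0%}")
```

Note that this is a reduction in training loss only; it says nothing on its own about accuracy on unseen queries.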

Capabilities

The model understands real estate domain terminology including:

  • Markets: Atlanta, Dallas, Houston, Charlotte, Orlando, etc.
  • Disposition Categories: Project Skyline, DQ, DiMaggio Kicks, etc.
  • Rental Metrics: CCMModelRent, CMOR1mFwd, TenantRent, ModelRent, IndexRent
  • Property Attributes: SQFT, Year Built, Beds, Baths, UnitStatus, DailyStatus
  • Valuation Metrics: CoreLogicAVM, PricePercentile50, XGBPrice, BayesPriceMean
  • Financial Metrics: GrossPrice, NetPrice, RegularBPO, QuickSaleBPO

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the LoRA adapter weights on top of it
base_model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)
model = PeftModel.from_pretrained(base_model, "bgarvey1/real-estate-domain-translator-phi3")
tokenizer = AutoTokenizer.from_pretrained("bgarvey1/real-estate-domain-translator-phi3")

# Generate SQL from a prompt in the "### Domain / Task / Input / Output" format
prompt = "### Domain: Real Estate Analysis\n### Task: Translate to SQL\n### Input: Show me the top 5 markets by unit count\n### Output:\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens so the prompt is not echoed in the result
prompt_length = inputs["input_ids"].shape[1]
result = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
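The prompt in the usage snippet follows a fixed "### Domain / Task / Input / Output" layout, so it can be convenient to wrap it in small helpers. The sketch below is one possible way to do that. `build_prompt` and `extract_sql` are hypothetical helper names, not part of the model's repository, and the assumption that the model may keep generating further `###` sections after the SQL is a guess rather than documented behavior.

```python
# Hypothetical helpers around the prompt layout shown in the usage snippet.
PROMPT_TEMPLATE = (
    "### Domain: Real Estate Analysis\n"
    "### Task: Translate to SQL\n"
    "### Input: {question}\n"
    "### Output:\n"
)


def build_prompt(question: str) -> str:
    """Fill the fixed prompt layout with a natural language question."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def extract_sql(decoded: str) -> str:
    """Return the SQL portion of decoded model output.

    Works whether or not the decoded text still contains the prompt:
    rpartition yields the whole string as the tail when the marker is absent.
    """
    _, _, tail = decoded.rpartition("### Output:")
    # Assumption: if the model continues with another "###" section, drop it.
    sql = tail.split("###", 1)[0]
    return sql.strip()
```

With these in place, a call might look like `extract_sql(tokenizer.decode(outputs[0], skip_special_tokens=True))` after generating from `build_prompt("Show me the top 5 markets by unit count")`.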

Example Translations

Input: "Show me the top 5 markets by unit count"
Output: SELECT "Market", SUM("Count") as unit_count FROM FKH_ML_DATA.ML_DATA.STATIC_TABLE_CACHE GROUP BY "Market" ORDER BY unit_count DESC LIMIT 5

Input: "List Project Skyline properties that are listed"
Output: SELECT "DispoCategory", "DispoStatus" FROM FKH_ML_DATA.ML_DATA.DISPO_CACHE WHERE "DispoCategory" = 'Project Skyline' AND "DispoStatus" = 'Listed'
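Input/output pairs like the ones above can double as simple regression checks for the adapter. The sketch below compares generated SQL against an expected string after light normalization. It is an illustrative approach, not the evaluation used for this model, and `normalize_sql` and `exact_match` are hypothetical names. Case is deliberately preserved, since double-quoted identifiers such as "Market" are case-sensitive in Snowflake.

```python
import re


def normalize_sql(sql: str) -> str:
    """Collapse runs of whitespace and drop a trailing semicolon.

    Limitation: whitespace inside string literals is collapsed too, which is
    acceptable for a rough check but not for a strict comparison.
    """
    collapsed = re.sub(r"\s+", " ", sql).strip()
    return collapsed.rstrip(";").rstrip()


def exact_match(generated: str, expected: str) -> bool:
    """True when two SQL strings are identical after normalization."""
    return normalize_sql(generated) == normalize_sql(expected)


expected = (
    'SELECT "DispoCategory", "DispoStatus" FROM FKH_ML_DATA.ML_DATA.DISPO_CACHE '
    "WHERE \"DispoCategory\" = 'Project Skyline' AND \"DispoStatus\" = 'Listed'"
)
# A differently formatted version of the same query, as a model might emit it.
generated = (
    'SELECT "DispoCategory", "DispoStatus"\n'
    "FROM FKH_ML_DATA.ML_DATA.DISPO_CACHE\n"
    "WHERE \"DispoCategory\" = 'Project Skyline'\n"
    "  AND \"DispoStatus\" = 'Listed';"
)
print(exact_match(generated, expected))
```

Exact match is a strict criterion: two queries that return the same rows but differ in column order or aliasing will count as a mismatch, so a check like this is best read as a lower bound.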

Training Details

  • Training Steps: 186 steps across 3 epochs
  • LoRA Configuration: r=16, alpha=32, dropout=0.1
  • Target Modules: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
  • Training Data: Real estate domain queries with business logic patterns
  • Validation: Domain-specific evaluation on real estate terminology
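The LoRA settings listed above map onto the keyword arguments of `peft.LoraConfig`. The sketch below shows one way they could be expressed. The `task_type` value is an assumption (the card does not state it), `build_lora_config` is a hypothetical helper, and the target module names are copied from the list above; whether each name matches a layer in a given Phi-3 checkpoint is worth verifying against the loaded model's module names.

```python
# LoRA hyperparameters as listed in the training details.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    "target_modules": [
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}


def build_lora_config():
    """Build a peft LoraConfig from the settings above (requires peft)."""
    from peft import LoraConfig  # imported lazily so the dict is usable without peft

    return LoraConfig(**LORA_KWARGS)


# With standard LoRA scaling, adapter updates are multiplied by lora_alpha / r.
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
print(f"LoRA scaling factor: {scaling}")
```

With r=16 and alpha=32 the scaling factor is 2.0, a common choice that doubles the adapter's contribution relative to alpha equal to r.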

This model serves as a translation layer between general-purpose language models and real estate domain-specific terminology and business logic.
