# Solidity LLM (Open-Sourced)

Solidity LLM is a specialized Large Language Model (LLM) developed by ChainGPT and fine-tuned to generate, understand, and analyze Solidity smart contracts. Designed for the decentralized development ecosystem, it outperforms larger models in ChainGPT's benchmarks on syntax accuracy (\~83% compilation success) and gas optimization (\~72% efficiency), and reaches \~65% OpenZeppelin compliance, a metric on which the larger GPT models still score higher. Using Solidity LLM is intended to shorten development cycles, reduce debugging time, and lower costs.

<figure><img src="/files/p20Qwy9A2PS7tSVK90qG" alt=""><figcaption></figcaption></figure>

***

### Model Information

* Developer: [ChainGPT](https://chaingpt.org)
* License: MIT License
* Base Model: Salesforce/codegen-2B-multi

#### Key Technical Details

* Model Type: Causal Language Model (Code Generation)
* Tokenizer: GPT2Tokenizer
* Parameters: 2 Billion
* Transformer Layers: 32
* Context Length: 2048 tokens
* Data Type: bfloat16
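
The figures above imply some rough sizing: 2 billion parameters at 2 bytes each (bfloat16) come to about 4 GB of weights, and the 2048-token context has to hold the prompt and the generated tokens together. The following is a minimal sketch of that arithmetic with hypothetical helper names. It ignores activations, the key-value cache, and framework overhead, and it assumes the context limit applies to prompt plus output combined.

```python
# Back-of-the-envelope sizing from the figures listed above.
# Helper names are illustrative and not part of any ChainGPT API.

PARAMETERS = 2_000_000_000   # "2 Billion" (nominal figure)
BYTES_PER_BF16 = 2           # bfloat16 stores each value in 16 bits
CONTEXT_LENGTH = 2048        # tokens


def weight_memory_gb(parameters: int = PARAMETERS,
                     bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return parameters * bytes_per_param / 1e9


def max_new_tokens_budget(prompt_tokens: int,
                          context_length: int = CONTEXT_LENGTH) -> int:
    """Tokens left for generation once the prompt occupies the context."""
    return max(context_length - prompt_tokens, 0)


print(weight_memory_gb())          # 4.0
print(max_new_tokens_budget(248))  # 1800
```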

#### Demo & Deployment

* [Hugging Face Model](https://huggingface.co/Chain-GPT/Solidity-LLM)
* [Interactive Demo](https://huggingface.co/spaces/Chain-GPT/ChainGPT-Solidity-LLM)

***

### Performance Benchmark

Solidity LLM was benchmarked against leading LLMs (GPT-4.5 Preview, GPT-4o mini, Qwen 2.5-Coder-7B, DeepSeek-Coder-7B). Key metrics included:

<figure><img src="/files/fUVxeoP0hxd76PD7EZtq" alt=""><figcaption></figcaption></figure>

| **Metric**               | **Solidity LLM** | **GPT-4.5** | **GPT-4o mini** | **Qwen** | **DeepSeek**  |
| ------------------------ | ---------------- | ----------- | --------------- | -------- | ------------- |
| Compilation Success Rate | 83%              | 50%         | 30%             | 20%      | 15%           |
| OpenZeppelin Compliance  | 65%              | 75%         | 70%             | 50%      | 40%           |
| Gas Efficiency           | 72%              | 68%         | 70%             | 60%      | 55%           |
| Security Posture         | 58%              | 70%         | 65%             | 55%      | 50%           |
| Line-of-Code Efficiency  | 70%              | 68%         | 69%             | 60%      | 58%           |

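*In these benchmarks, Solidity LLM leads on compilation success, gas efficiency, and line-of-code efficiency, while GPT-4.5 Preview and GPT-4o mini score higher on OpenZeppelin compliance and security posture.*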
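
Reading the table metric by metric shows where each model leads. The sketch below copies the values from the table and picks the top scorer per row; the `leader` helper is illustrative only.

```python
# Scores copied from the benchmark table above (percentages).
MODELS = ["Solidity LLM", "GPT-4.5", "GPT-4o mini", "Qwen", "DeepSeek"]
ROWS = {
    "Compilation Success Rate": [83, 50, 30, 20, 15],
    "OpenZeppelin Compliance":  [65, 75, 70, 50, 40],
    "Gas Efficiency":           [72, 68, 70, 60, 55],
    "Security Posture":         [58, 70, 65, 55, 50],
    "Line-of-Code Efficiency":  [70, 68, 69, 60, 58],
}


def leader(metric: str) -> str:
    """Return the model with the highest score for the given metric."""
    scores = dict(zip(MODELS, ROWS[metric]))
    return max(scores, key=scores.get)


for metric in ROWS:
    print(f"{metric}: {leader(metric)}")
```

With these numbers, Solidity LLM is the top scorer on three of the five metrics, and GPT-4.5 on the other two.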

***

### Use Cases

#### Direct Use

* Smart contract development assistance
* Solidity educational resources
* Documentation and template creation

#### Downstream Applications

* Integrated Development Environments (IDEs)
* Autonomous blockchain agents

#### Out-of-Scope Uses

* General-purpose coding (other languages)
* Legal auditing or formal verification without human oversight
* Production deployment without manual review

***

### Risks, Biases, and Limitations

* Possible biases from training datasets
* Occasional hallucinations or logically incorrect outputs
* Caution required in financial or high-stakes scenarios

Recommendation: Always conduct manual code reviews and thorough testing before deploying generated code.
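
As a first pass before the manual review recommended above, generated code can be screened for constructs that usually deserve a closer look. The sketch below is a hypothetical helper, not a ChainGPT tool and not a substitute for an audit. The pattern list is an illustrative assumption and far from complete.

```python
import re

# Illustrative patterns only; a real review covers far more than this.
REVIEW_PATTERNS = {
    "tx.origin": r"\btx\.origin\b",
    "selfdestruct": r"\bselfdestruct\s*\(",
    "delegatecall": r"\.delegatecall\s*\(",
    "low-level call": r"\.call\s*[({]",
}


def flag_for_review(source: str) -> list[str]:
    """Return labels of constructs a human reviewer should inspect."""
    flags = [label for label, pattern in REVIEW_PATTERNS.items()
             if re.search(pattern, source)]
    # A contract without a version pragma is also worth a second look.
    if not re.search(r"\bpragma\s+solidity\b", source):
        flags.append("missing pragma")
    return flags


sample = """
pragma solidity ^0.8.0;
contract Wallet {
    address owner;
    function withdraw() public {
        require(tx.origin == owner);
        (bool ok, ) = msg.sender.call{value: address(this).balance}("");
        require(ok);
    }
}
"""
print(flag_for_review(sample))  # ['tx.origin', 'low-level call']
```

An empty result only means none of the listed patterns matched; it says nothing about the contract's logic.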

***

### Getting Started

#### Requirements

```
pip install transformers==4.51.3 torch==2.7.0 accelerate==1.6.0
```

#### Basic Usage

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Chain-GPT/Solidity-LLM"
# Use the GPU when one is available; fall back to CPU otherwise
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).to(device)

prompt = "Write a Solidity function to transfer tokens."
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Generate Solidity code (prompt plus new tokens should fit the 2048-token context)
outputs = model.generate(**inputs, max_new_tokens=1400, pad_token_id=tokenizer.eos_token_id)
# The decoded text contains the prompt followed by the generated code
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)
```
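
Decoding `outputs[0]` in the example above returns the prompt followed by the completion, because a decoder-only model's `generate` output includes the input tokens. One common way to keep only the new text is to slice off the prompt tokens before decoding. The following is a minimal sketch on plain Python lists with a hypothetical helper name; with tensors the equivalent slice would be `outputs[0][inputs["input_ids"].shape[1]:]`.

```python
def completion_ids(output_ids: list[int], prompt_ids: list[int]) -> list[int]:
    """Drop the echoed prompt from a decoder-only model's output ids."""
    if output_ids[:len(prompt_ids)] != prompt_ids:
        raise ValueError("output does not start with the prompt ids")
    return output_ids[len(prompt_ids):]


# Toy ids standing in for real token ids.
prompt_ids = [101, 102, 103]
output_ids = [101, 102, 103, 7, 8, 9]
print(completion_ids(output_ids, prompt_ids))  # [7, 8, 9]
```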

#### Streaming Mode (Direct Code Generation)

```
import torch
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

# Load the weights in bfloat16 directly onto the GPU
model = AutoModelForCausalLM.from_pretrained(
    "Chain-GPT/Solidity-LLM",
    torch_dtype=torch.bfloat16,
    device_map="cuda"
)
tokenizer = AutoTokenizer.from_pretrained("Chain-GPT/Solidity-LLM")

prompt = "Develop a Solidity Contract for a lottery requiring 1 ETH for registration with a 10 ETH reward."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
# skip_prompt=True streams only the newly generated text
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Run generation in a background thread so the main thread can consume the stream
thread = Thread(target=model.generate, kwargs={
    **inputs,  # input_ids and attention_mask
    "max_new_tokens": 1800,
    "temperature": 0.7,
    "do_sample": True,
    "pad_token_id": tokenizer.eos_token_id,
    "streamer": streamer
})
thread.start()

for chunk in streamer:
    print(chunk, end="", flush=True)
thread.join()
```
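
The loop above prints chunks as they arrive. If the full contract is also needed afterwards, for example to write it to a file, the chunks can be accumulated while streaming. The following is a minimal sketch with a hypothetical helper. It accepts any iterable of strings, which is how `TextIteratorStreamer` is consumed above, so a plain list stands in for the streamer here.

```python
from typing import Iterable


def stream_and_collect(chunks: Iterable[str], echo: bool = True) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        if echo:
            print(chunk, end="", flush=True)
        parts.append(chunk)
    return "".join(parts)


# A list of strings stands in for the streamer in this sketch.
fake_stream = ["pragma solidity ^0.8.0;\n", "contract Lottery {", "}\n"]
source = stream_and_collect(fake_stream, echo=False)
print(len(source))
```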

***

### Training Details

* Compute Resources: 80 GB GPU cluster (4 GPUs)
* Training Duration: \~1095 hours (1.5 months)
* Pre-training: 1 billion tokens of raw Solidity data
* Fine-tuning dataset:
  * Solidity version ≥ 0.5
  * 200-4000 tokens per contract
  * 650,000 curated, deduplicated instructions
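
The fine-tuning criteria above (version floor, token range, deduplication) can be pictured as a simple filter. This is an illustrative sketch only, not ChainGPT's actual data pipeline. The whitespace-based token count is a stand-in for the real tokenizer, exact-match hashing is an assumed deduplication method, and all function names are hypothetical.

```python
import hashlib
import re

MIN_TOKENS, MAX_TOKENS = 200, 4000
MIN_VERSION = (0, 5)


def pragma_version(source: str):
    """Return (major, minor) from the first `pragma solidity` line, or None."""
    match = re.search(r"pragma\s+solidity\s+[^\d;]*(\d+)\.(\d+)", source)
    return (int(match.group(1)), int(match.group(2))) if match else None


def count_tokens(source: str) -> int:
    """Stand-in token count: whitespace-separated pieces (assumption)."""
    return len(source.split())


def filter_contracts(contracts):
    """Keep contracts that pass the version, length, and duplicate checks."""
    seen, kept = set(), []
    for source in contracts:
        version = pragma_version(source)
        if version is None or version < MIN_VERSION:
            continue
        if not MIN_TOKENS <= count_tokens(source) <= MAX_TOKENS:
            continue
        digest = hashlib.sha256(source.strip().encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate of a contract already kept
        seen.add(digest)
        kept.append(source)
    return kept


body = " ".join(["uint256 x;"] * 150)  # 300 whitespace-separated pieces
new = f"pragma solidity ^0.8.0; contract A {{ {body} }}"
old = f"pragma solidity ^0.4.24; contract B {{ {body} }}"
short = "pragma solidity ^0.8.0; contract C {}"
print(len(filter_contracts([new, new, old, short])))  # 1
```

Only the first copy of `new` survives: the second is a duplicate, `old` is below version 0.5, and `short` is under the 200-token floor.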

***

### Future Roadmap

<table><thead><tr><th width="105.046875">Priority</th><th width="372.4921875">Feature</th><th>Timeline</th></tr></thead><tbody><tr><td>High</td><td>Enhanced Solidity &#x26; OpenZeppelin support</td><td>Q3 2025</td></tr><tr><td>Medium</td><td>In-line code editing tools</td><td>Q4 2025</td></tr><tr><td>Medium</td><td>Expanded compatibility (e.g., Rust for Solana)</td><td>Q1 2026</td></tr><tr><td>Low</td><td>Increased context capacity</td><td>Q2 2026</td></tr></tbody></table>

***

### Community & Support

* Hugging Face: <https://huggingface.co/Chain-GPT/Solidity-LLM>
* [Discord Community](https://discord.gg/chaingpt)

***

### Conclusion

Solidity LLM by ChainGPT empowers Web3 developers with a reliable, high-performance model explicitly crafted for Solidity smart contract generation, combining robust technical performance with tangible business impact.

