---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# Requirement Checker Adapters

## Model Summary

This **Requirement Checker** family of adapters is designed to check whether specified requirements were satisfied by the last model generation. Only one requirement is checked at a time (multiple requirements can be checked with parallel model calls).

- **Developer:** IBM Research
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

## Usage

### Intended use

**Usage steps:** Given a generation task and a set of requirements:

1. Use the base model to generate a response as normal (via the `assistant` role), with the prompt describing the task followed by "Requirements:" and the list of active requirements.
2. Repeat the requirement to be checked.
3. The Requirement Checker model will respond with "true" or "false", where "true" means the requirement is satisfied; the sketch after this list illustrates the conversation shape.
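
As an illustration, the conversation for a single check might be shaped like the sketch below. This is a minimal sketch: the task text, the requirement wording, and the choice to repeat the requirement as a final `user` turn are assumptions for illustration, not a prescribed template.

```python
# Illustrative conversation structure for one requirement check.
# All strings here are made-up examples, not a confirmed prompt template.
messages = [
    {
        "role": "user",
        "content": (
            "Write a one-paragraph product announcement.\n"
            "Requirements:\n"
            "1. Mention the release date.\n"
            "2. Do not exceed 100 words."
        ),
    },
    # Step 1: the base model's generation to be checked (assistant role).
    {"role": "assistant", "content": "Our new app launches on June 1st! ..."},
    # Step 2: repeat exactly ONE requirement for the checker to verify.
    {"role": "user", "content": "Requirement: Mention the release date."},
]
# Step 3: with the Requirement Checker adapter active, the next
# generation should be "true" or "false".
```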

### Quickstart Example

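A minimal quickstart sketch for a LoRA variant with `transformers` and `peft` follows. The base-model ID and adapter path below are placeholders (assumptions, not confirmed repository names); substitute the adapter checkpoint that matches your base model. The aLoRA variants may need IBM's activated-LoRA tooling rather than plain `peft`.

```python
# A hedged quickstart sketch (LoRA variant). The model and adapter IDs
# below are placeholders, not confirmed repository paths.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "ibm-granite/granite-3.3-8b-instruct"  # assumed base model ID
ADAPTER_PATH = "path/to/requirement-checker-lora"   # hypothetical adapter location

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER_PATH)

# Conversation shaped as described under "Intended use": the task with its
# requirements, the generation to be checked, then the one requirement to verify.
messages = [
    {"role": "user", "content": "Summarize the notes.\nRequirements:\n1. Use bullet points."},
    {"role": "assistant", "content": "- Budget approved\n- Launch moved to May"},
    {"role": "user", "content": "Requirement: Use bullet points."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=4)
print(tokenizer.decode(output[0, input_ids.shape[1]:], skip_special_tokens=True))
# Expected output: "true" or "false"
```

Because each call checks a single requirement, checking several requirements amounts to issuing one such generate call per requirement, in parallel if desired.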

## Evaluation

Each model was evaluated on 200 rows of held-out synthetic data. Error rates are as follows:

**aLoRA models**
* Granite 3.3 2B: 6.0%
* Granite 3.3 8B: 5.75%
* GPT-OSS 20B: 5.75%

**LoRA models**
* Granite 3.3 2B: 4.5%
* Granite 3.3 8B: 4.0%
* GPT-OSS 20B: 4.0%

### Training Data

Synthetic data generated by Mixtral 8x22B and GPT-OSS 120B.

## Model Card Authors

Kristjan Greenewald, Bo Wu