jhleepidl committed on
Commit ee9d1b7 · verified · 1 Parent(s): 5d2ccf3

Update README.md

Files changed (1)
  1. README.md +13 -73
README.md CHANGED
@@ -8,7 +8,6 @@ tags:
  - dynamic-scaling
  - api-retrieval
  - tool-discovery
- - library-management
  license: mit
  ---
 
@@ -20,29 +19,24 @@ A sophisticated query embedding model designed to act as an intelligent "librari

  This model serves as a **Librarian of Tools** - an AI system that understands user intentions and finds the most appropriate tools, APIs, or functions to accomplish their tasks. It's particularly effective for:

- - **API Discovery**: Finding relevant APIs from large collections
- - **Tool Recommendation**: Suggesting appropriate tools for specific tasks
- - **Function Retrieval**: Matching queries to available functions or methods
- - **Library Management**: Organizing and retrieving from tool libraries
  - **Iterative Search**: Progressive tool discovery with residual-based refinement

  ## 🏗️ Model Architecture

  - **Base Model**: ToolBench/ToolBench_IR_bert_based_uncased
- - **Architecture**: Dynamic direction-focused query embedding with scale prediction
  - **Special Features**:
- - **Dynamic Scale Prediction**: Each query gets its own optimal scaling factor
- - **Direction-Focused Training**: Prioritizes semantic alignment over magnitude
  - **Query-Specific Adaptation**: Tailors embeddings to individual query characteristics
  - **Balanced Magnitude Handling**: Maintains appropriate scaling for retrieval tasks
- - **Residual-Based Iteration**: Supports iterative greedy search for comprehensive tool discovery

  ## 🎓 Training Strategy

  - **Training Approach**: Dynamic direction-focused with AdamW optimizer
  - **Loss Function**: Combined MSE, direction loss, and magnitude loss
- - **Scale Prediction**: Uses softplus + 1 activation for 1+ scale factors
- - **Dataset**: Trained on normalized query-API pairs for optimal tool retrieval

  ## 🚀 Usage
 
@@ -93,9 +87,9 @@ print("Query similarity matrix:")
  print(similarity_matrix)
  ```

- ### Iterative Greedy Search for Comprehensive Tool Discovery

- The model supports iterative greedy search, which progressively discovers tools by removing found APIs from the query representation and continuing the search:

  ```python
  import torch
@@ -103,14 +97,14 @@ import torch.nn.functional as F
  import numpy as np

  class LibrarianSearch:
-     def __init__(self, model, tokenizer, vector_db_index, documents, threshold=0.6):
          self.model = model
          self.tokenizer = tokenizer
          self.index = vector_db_index
          self.documents = documents
          self.threshold = threshold

-     def get_query_embedding(self, query, normalize=True):
          """Get query embedding using the Librarian model"""
          inputs = self.tokenizer(
              query,
@@ -245,77 +239,23 @@ def beam_search_iterative(self, query, beam_size=5):
          pass
  ```

- ## 🎯 Use Cases
-
- ### 1. API Discovery
- ```python
- # Find APIs for specific tasks
- query = "Convert image to different formats"
- # Model will help find relevant image processing APIs
- ```
-
- ### 2. Tool Recommendation
- ```python
- # Recommend tools for data analysis
- query = "Analyze time series data and create visualizations"
- # Model will suggest appropriate data analysis tools
- ```
-
- ### 3. Function Retrieval
- ```python
- # Find functions in code libraries
- query = "Calculate distance between two geographic coordinates"
- # Model will help locate relevant geospatial functions
- ```
-
- ### 4. Multi-Tool Discovery
- ```python
- # Find multiple tools for complex workflows
- query = "Send email notifications, process uploaded files, and generate reports"
- # Iterative search will find email, file processing, and reporting tools
- ```
-
- ## 📊 Performance
-
- This model excels at:
- - **Semantic Understanding**: Captures nuanced differences between similar tool requests
- - **Direction Alignment**: Ensures embeddings point in the right semantic direction
- - **Magnitude Optimization**: Maintains appropriate scaling for retrieval systems
- - **Query Adaptation**: Tailors responses to specific query characteristics
- - **Iterative Discovery**: Progressively finds multiple relevant tools through residual-based search
- - **Beam Search Optimization**: Finds optimal combinations of tools for complex queries
-
- ## 🔧 Integration
-
- The model is designed to work seamlessly with:
- - Vector databases (FAISS, Pinecone, Weaviate)
- - Retrieval systems
- - Recommendation engines
- - Tool discovery platforms
- - API marketplaces
- - Iterative search frameworks
-
  ## 📚 Citation

  If you use this model in your research or applications, please cite:

  ```bibtex
  @misc{librarian_of_tools,
- title={Librarian of Tools: Dynamic Direction-Focused Query Embedding for Tool Discovery},
  author={jhleepidl},
- year={2024},
- url={https://huggingface.co/jhleepidl/librarian}
  }
  ```

- ## 🤝 Contributing
-
- This model is part of a larger effort to improve tool discovery and API retrieval. Contributions and feedback are welcome!
-
  ## 📄 License

  This model is released under the MIT License, making it suitable for both research and commercial applications.

  ---

- **The Librarian of Tools** - Your intelligent assistant for discovering the right tools for any task! 🛠️📚
 
  - dynamic-scaling
  - api-retrieval
  - tool-discovery
  license: mit
  ---
 
 
  This model serves as a **Librarian of Tools** - an AI system that understands user intentions and finds the most appropriate tools, APIs, or functions to accomplish their tasks. It's particularly effective for:

+ - **API Discovery**: Finding relevant rapid APIs from large collections
  - **Iterative Search**: Progressive tool discovery with residual-based refinement

  ## 🏗️ Model Architecture

  - **Base Model**: ToolBench/ToolBench_IR_bert_based_uncased
+ - **Architecture**: Query embedding with scale prediction
  - **Special Features**:
  - **Query-Specific Adaptation**: Tailors embeddings to individual query characteristics
  - **Balanced Magnitude Handling**: Maintains appropriate scaling for retrieval tasks
+ - **Residual-Based Iteration**: Supports iterative search for comprehensive tool discovery

  ## 🎓 Training Strategy

  - **Training Approach**: Dynamic direction-focused with AdamW optimizer
  - **Loss Function**: Combined MSE, direction loss, and magnitude loss
+ - **Scale Prediction**: Uses softplus + 1 activation for 1 + scale factors
+ - **Dataset**: Trained on query-(sum vector of relevant API embeddings) pairs from ToolBench(https://github.com/OpenBMB/ToolBench)
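
The Scale Prediction, Loss Function, and Dataset bullets above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the README: the helper names (`softplus`, `predict_scale`, `sum_vector`, `combined_loss`), the cosine-based direction term, the norm-difference magnitude term, and the equal loss weights are all assumptions, since the README does not spell them out.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); the result is never negative.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def predict_scale(raw_output):
    # "softplus + 1": the predicted scale factor is always at least 1.
    return 1.0 + softplus(raw_output)

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def sum_vector(api_embeddings):
    # Training target: element-wise sum of the relevant API embeddings.
    return [sum(column) for column in zip(*api_embeddings)]

def combined_loss(pred, target, w_mse=1.0, w_dir=1.0, w_mag=1.0):
    # Assumed form: weighted sum of MSE, (1 - cosine), and squared norm gap.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    cosine = sum(p * t for p, t in zip(pred, target)) / (norm(pred) * norm(target))
    direction = 1.0 - cosine
    magnitude = (norm(pred) - norm(target)) ** 2
    return w_mse * mse + w_dir * direction + w_mag * magnitude

# Two relevant APIs with unit-norm embeddings (hypothetical values).
target = sum_vector([[1.0, 0.0], [0.0, 1.0]])   # [1.0, 1.0]
unit_query = [math.sqrt(0.5), math.sqrt(0.5)]   # right direction, norm 1
scale = predict_scale(0.0)                      # 1 + ln 2
scaled_query = [scale * x for x in unit_query]
print(combined_loss(unit_query, target), combined_loss(scaled_query, target))
```

In this toy case the scaled query has a lower loss than the unit-norm one, because the sum of two API embeddings has a norm larger than 1. That is presumably why the scale factor is constrained to be at least 1.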
 
  ## 🚀 Usage
 
 
  print(similarity_matrix)
  ```

+ ### Iterative Search for Comprehensive Tool Discovery

+ The model supports iterative search, which progressively discovers tools by removing found APIs from the query representation and continuing the search:

  ```python
  import torch
 
  import numpy as np

  class LibrarianSearch:
+     def __init__(self, model, tokenizer, vector_db_index, documents, threshold=0.5):
          self.model = model
          self.tokenizer = tokenizer
          self.index = vector_db_index
          self.documents = documents
          self.threshold = threshold

+     def get_query_embedding(self, query, normalize=False):
          """Get query embedding using the Librarian model"""
          inputs = self.tokenizer(
              query,
 
          pass
  ```
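
The residual-based loop described in this section can be sketched without any model or vector database. This is an assumption-laden illustration rather than the README's `LibrarianSearch` implementation, whose body is not shown in this diff: the function name `iterative_search`, the inner-product scoring, the `max_steps` cap, and the toy three-dimensional embeddings are all hypothetical. Only the default `threshold=0.5` and the unnormalized query come from the diff.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def iterative_search(query_vec, api_vecs, threshold=0.5, max_steps=10):
    """Pick APIs one at a time, subtracting each found embedding from the residual."""
    residual = list(query_vec)
    found = []
    for _ in range(max_steps):
        # Score every API that has not been picked yet against the residual.
        candidates = [(dot(residual, vec), name)
                      for name, vec in api_vecs.items() if name not in found]
        if not candidates:
            break
        best_score, best_name = max(candidates)
        if best_score < threshold:
            break  # nothing left in the residual matches well enough
        found.append(best_name)
        # Remove the found API from the query representation.
        residual = [r - v for r, v in zip(residual, api_vecs[best_name])]
    return found

# Hypothetical unit-norm API embeddings.
api_vecs = {
    "send_email": [1.0, 0.0, 0.0],
    "process_file": [0.0, 1.0, 0.0],
    "get_weather": [0.0, 0.0, 1.0],
}
# Unnormalized query embedding, roughly the sum of two API embeddings.
query_vec = [1.0, 0.9, 0.1]
print(iterative_search(query_vec, api_vecs))
```

With these toy vectors the search returns `send_email` and then `process_file`, and stops because `get_weather` scores only about 0.1 against the remaining residual, which is below the threshold.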
 
  ## 📚 Citation

  If you use this model in your research or applications, please cite:

  ```bibtex
  @misc{librarian_of_tools,
+ title={Librarian of Tools},
  author={jhleepidl},
+ year={2025},
+ url={https://github.com/jhleepidl/librarian}
  }
  ```

  ## 📄 License

  This model is released under the MIT License, making it suitable for both research and commercial applications.

  ---

+ **The Librarian of Tools** - Your intelligent assistant for discovering the right tools for any task! 🛠️📚