GMI Cloud is excited to announce that we are now hosting DeepSeek-R1 and its distilled models, served from a dedicated inference endpoint on optimized, US-based hardware.
What’s DeepSeek-R1? Read our initial takeaways here.
Technical details:
- Model Provider: DeepSeek
- Type: Chat
- Parameters: 685B
- Deployment: Serverless (MaaS) or Dedicated Endpoint
- Quantization: FP16
- Context Length: 128K (the model can process up to 128,000 tokens within a single session)
Additionally, we are offering the following distilled models:
- DeepSeek-R1-Distill-Llama-70B
- DeepSeek-R1-Distill-Qwen-32B
- DeepSeek-R1-Distill-Qwen-14B
- DeepSeek-R1-Distill-Llama-8B
- DeepSeek-R1-Distill-Qwen-7B
- DeepSeek-R1-Distill-Qwen-1.5B
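For a feel of how a serverless call might look, here is a minimal sketch using an OpenAI-compatible Python client. The base URL, environment variable, and model identifiers below are illustrative placeholders, not GMI Cloud's actual values; substitute the credentials and endpoint from your account.

```python
# Minimal sketch: querying a hosted DeepSeek-R1 endpoint through an
# OpenAI-compatible client. The base URL, API key variable, and model
# name are placeholders -- use the values provided with your account.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-endpoint.com/v1",  # placeholder URL
    api_key=os.environ["GMI_API_KEY"],               # placeholder env var
)

response = client.chat.completions.create(
    model="DeepSeek-R1",  # or a distilled variant, e.g. "DeepSeek-R1-Distill-Qwen-32B"
    messages=[
        {"role": "user", "content": "Explain chain-of-thought prompting in two sentences."}
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```

The same chat-completions call shape works for the full 685B model and the distilled variants; only the model identifier changes.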
Try our serverless service today, free of per-token charges and with unlimited usage!
Reach out for access to our dedicated endpoint here.