  • 0 Votes
    2 Posts
    0 Views
windsheep@infosec.exchange
https://hermes-agent.nousresearch.com/docs/user-guide/features/rl-training

Hermes is an evolutionary approach, compared to the DSPy agent model, which is programmatic/declarative. For many use cases, DSPy is a lot of work (fine-tuning, debugging). DSPy excels at finance or data-analytics tasks where you already know all the details. But that is phase 2. Phase 1 is getting those details, and getting them fast; that, imho, is where Hermes comes in. I can see myself using Hermes for:

* Content tooling: auto-briefing, strategic foresight assessments.
* Intelligence analysis: methodical approaches to trend analysis or signals.
* Individualization of content: OSINT, social media scraping, trend analysis.
* Data sensing and estimation on datasets; start simple.
* Auto-research (later), for example for optimization solutions or performance debugging.

#dspy #hermes #declarative #evolutionary #ai
  • 0 Votes
    1 Posts
    4 Views
heiseonline@social.heise.de
US intellectual property stolen: US embassies to warn against DeepSeek & Co.
The US government is adopting the criticism that AI companies such as Anthropic have leveled at the competition from China. US embassies are to warn against using DeepSeek.
https://www.heise.de/news/Geistiges-US-Eigentum-gestohlen-US-Botschaften-sollen-vor-DeepSeek-Co-warnen-11272583.html?wt_mc=sm.red.ho.mastodon.mastodon.md_beitraege.md_beitraege&utm_source=mastodon
#Anthropic #DeepSeek #IT #KünstlicheIntelligenz #Wirtschaft #news
  • 0 Votes
    1 Posts
    0 Views
arint@arint.info
RT @bookwormengr: DeepSeek V4 hits it out of the park and addresses the HBM shortage: DeepSeek proves why it is such a fundamental research lab. In addition to exceeding Opus 4.6 on Terminal Bench and virtually matching it on other performance metrics, the most notable advancement is this statement: "In the 1M-token context setting, DeepSeek-V4-Pro requires only 27% of single-token inference FLOPs and 10% of KV cache compared with DeepSeek-V3.2."

To understand the significance of this point, consider the memory layout of the Prefill and Decode nodes (the original post includes a diagram). If you implement Decode with Data and Expert Parallelism (DEP16) across 16 GPUs on a GB200 or GB300 NVL72 rack running DeepSeek V3.2, you are left with 104 GB or 176 GB of HBM per GPU respectively, assuming the MoE parameters are stored in NVFP4. The remaining HBM per GPU dictates how large a batch size you can run for inference, which determines how many concurrent requests you can serve. Consider the GB300 with 176 GB left:

1. At 128K context, the KV cache needs 4.45 GB per request, so you can serve only 36 concurrent requests.
2. At 256K context, the KV cache needs 8.90 GB per request, so only 18 concurrent requests.
3. At 512K context, the KV cache needs 17.80 GB per request, so only 9 concurrent requests.
4. At 1M context, the KV cache needs 35.60 GB per request, so only 4 concurrent requests.

You see the point. Now imagine you actually needed 10 times less KV cache at 1M context: it lets you serve roughly 10 times more requests with the same resources. Recall that Decode is memory-bound, not compute… more at Arint.info #DeepSeek #nitter #arint_info https://x.com/bookwormengr/status/2047527303824236545#m
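The HBM arithmetic in the post above can be sketched as follows. The 4.45 GB per request at 128K context and the 176 GB of free HBM are the post's figures; the linear KV-cache scaling with context length and the use of the full 176 GB as budget are simplifying assumptions of mine, so the concurrent-request counts come out slightly higher than the post's (which presumably reserves some headroom).

```python
import math

HBM_FREE_GB = 176.0     # free HBM per GPU on GB300 after weights (post's figure)
KV_GB_AT_128K = 4.45    # KV cache per request at 128K context (post's figure)

def kv_cache_gb(context_tokens: int) -> float:
    """KV cache per request, assuming linear scaling from the 128K figure."""
    return KV_GB_AT_128K * context_tokens / (128 * 1024)

def max_concurrent(context_tokens: int, kv_fraction: float = 1.0) -> int:
    """Requests that fit in free HBM; kv_fraction models a shrunken cache
    (e.g. 0.10 for V4-Pro's claimed 10% KV cache vs. V3.2)."""
    return math.floor(HBM_FREE_GB / (kv_cache_gb(context_tokens) * kv_fraction))

for ctx_k in (128, 256, 512, 1024):
    ctx = ctx_k * 1024
    print(f"{ctx_k:>4}K context: {kv_cache_gb(ctx):5.2f} GB/request, "
          f"{max_concurrent(ctx)} concurrent requests")

# With only 10% of the KV cache at 1M context, roughly 10x more requests fit:
print(max_concurrent(1024 * 1024, kv_fraction=0.10))
```

Note that the post's 35.60 GB figure for "1M" context implies 1M is treated as 1024K tokens (8 × the 128K figure), which is what the linear model above reproduces.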
  • 0 Votes
    1 Posts
    0 Views
arint@arint.info
RT @deepseek_ai: The DeepSeek-V4-Pro API is 75% off until May 5, 2026, 15:59 (UTC)! Don't miss this massive discount. more at Arint.info #API #DeepSeek #Integration #KünstlicheIntelligenz #Rabatt #SoftwareUpdate #arint_info https://x.com/deepseek_ai/status/2048062777357750316#m
  • 0 Votes
    1 Posts
    2 Views
arint@arint.info
RT @Hesamation: DeepSeek-V4 uses the Muon optimizer with Kimi's recipe to scale it for training large language models. Meanwhile, Kimi K2 (and K2.6) uses DeepSeek-V3's architectural techniques (ultra-sparse MoE + MLA). Open-source AI labs build on each other's research, and that is exactly how it should be. more at Arint.info #DeepSeek #KI #Kimi #LLM #MachineLearning #OpenSource #arint_info https://x.com/Hesamation/status/2047681313226854838#m
  • 0 Votes
    1 Posts
    7 Views
arint@arint.info
RT @teortaxesTex: TRANSLATION: DeepSeek with a 1M context length (what we regard as V4 [-lite], and what is deployed on web/app as -Instant) is available via the official API. The API documentation has not been updated yet. They are rolling out the next generation. 永雏塔菲 (@xhyctf) reports that the official ds API has been updated: https://nitter.net/xhyctf/status/2046855733288067266#m more at Arint.info #AI #API #DeepSeek #MachineLearning #TechNews #arint_info https://x.com/teortaxesTex/status/2046858634789875978#m