NVIDIA Unveils Open-Source Giant Nemotron-4 340B, Aligned with 98% Synthetic Data, Aiming to Be the Strongest Open-Source General Model!
NVIDIA introduces Nemotron-4 340B, an open-source model family that could change how LLMs are trained. Leveraging synthetic data, it surpasses Mixtral 8x22B, Claude Sonnet, Llama 3 70B, and Qwen 2, and even competes with GPT-4 on some benchmarks. The family includes Base, Instruct, and Reward models and supports a 4K context window, 50+ natural languages, and 40+ programming languages. Notably, 98% of the data used to align the Instruct model is synthetic. The Instruct model shows strong performance on common-sense reasoning tasks, while the Reward model surpasses GPT-4o-0513 and Gemini 1.5 Pro-0514 in RewardBench accuracy. The models are released under a commercially friendly open license, can be customized with NVIDIA NeMo, and are optimized for inference with TensorRT-LLM. Their potential impact spans healthcare, finance, and beyond, but also raises concerns about data privacy and ethics.
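
To make the synthetic-data idea concrete, here is a minimal Python sketch of the generate-then-filter loop the release describes: the Instruct model drafts candidate responses and the Reward model scores them so that only high-quality pairs are kept for fine-tuning. The endpoint URL, model IDs, reward-score parsing, and quality threshold below are illustrative assumptions, not NVIDIA's official recipe.

```python
# Sketch: generate synthetic training pairs with the Instruct model, then
# keep only the pairs the Reward model rates highly. Endpoint, model IDs,
# and reward parsing are assumptions, not an official NVIDIA recipe.
import re
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

def draft_answer(prompt: str) -> str:
    """Ask the Instruct model for a candidate response."""
    resp = client.chat.completions.create(
        model="nvidia/nemotron-4-340b-instruct",  # assumed model ID
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
        max_tokens=512,
    )
    return resp.choices[0].message.content

def reward_score(prompt: str, answer: str) -> float:
    """Score a (prompt, answer) pair with the Reward model.

    The response format is assumed to contain numeric attribute scores in
    its text; check the actual API docs before relying on this parsing.
    """
    resp = client.chat.completions.create(
        model="nvidia/nemotron-4-340b-reward",  # assumed model ID
        messages=[
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": answer},
        ],
    )
    numbers = [float(x) for x in re.findall(r"-?\d+(?:\.\d+)?",
                                            resp.choices[0].message.content)]
    return sum(numbers) / len(numbers) if numbers else 0.0

prompts = ["Explain why synthetic data helps when human-annotated data is scarce."]
kept = []
for p in prompts:
    a = draft_answer(p)
    if reward_score(p, a) > 3.5:  # arbitrary quality threshold for illustration
        kept.append({"prompt": p, "response": a})

print(f"kept {len(kept)} synthetic example(s) for fine-tuning")
```

The filtered pairs would then feed a supervised fine-tuning or preference-tuning run (for example with NVIDIA NeMo), which is how a pipeline like this can reach the reported 98% synthetic share of alignment data.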
