DeepSeek-OCR: A Novel Open-Source Model with Impressive Capabilities
This article introduces DeepSeek-OCR, a new open-source model from the DeepSeek team. It is more than another OCR tool: it proposes a different way to handle long text contexts. Traditional large language models struggle with long inputs because the computational cost of attention grows quadratically with sequence length. DeepSeek-OCR addresses this by rendering text as a two-dimensional image and encoding that image into visual tokens. This significantly reduces token consumption within the context window, reaching compression ratios of up to about 10x (roughly ten text tokens represented by one visual token) while maintaining high recognition accuracy. Using excerpts from AI assistant chat logs, the article explains how DeepSeek-OCR's DeepEncoder and its DeepSeek-3B decoder work together. The model also draws inspiration from human memory decay and visual perception: it implements a 'digital forgetting curve' in which older information gradually fades, much as human memories do over time, offering a fresh perspective on AI memory management.
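The arithmetic behind these claims can be sketched in a few lines of Python. This is a rough illustration, not DeepSeek-OCR's implementation: the function names, the token counts, and the halving schedule used for the 'forgetting curve' are all hypothetical, chosen only to show how a 10x token reduction interacts with quadratic attention cost and how an age-based token budget might behave.

```python
def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Text tokens represented per visual token."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


def attention_cost_reduction(ratio: float) -> float:
    """Self-attention compares every token with every other token, so its
    cost scales with n^2. Shrinking n by `ratio` shrinks that cost by ratio^2."""
    return ratio ** 2


def vision_token_budget(base_tokens: int, age: int,
                        decay: float = 0.5, floor: int = 1) -> int:
    """Hypothetical forgetting schedule (an assumption, not the model's actual
    rule): each step of age multiplies the token budget by `decay`, so older
    context is kept at lower resolution but never drops below `floor` tokens."""
    return max(floor, int(base_tokens * decay ** age))


if __name__ == "__main__":
    # Illustrative numbers only: 1000 text tokens rendered into 100 visual tokens.
    ratio = compression_ratio(1000, 100)
    print(f"compression ratio: {ratio:.1f}x")
    print(f"attention cost reduction: {attention_cost_reduction(ratio):.0f}x")
    for age in range(4):
        print(f"age {age}: {vision_token_budget(256, age)} visual tokens")
```

Under these assumptions, a 10x reduction in tokens corresponds to roughly a 100x reduction in pairwise attention work, which is why compressing context into visual tokens is attractive for long inputs. The decaying budget shows one simple way that older context could be stored more coarsely than recent context.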










