Large Language Models (LLMs) are exceptionally demanding on compute and memory, but Apple is reportedly experimenting with storing this technology on flash storage, likely to make it easily accessible across multiple devices. The technology giant wishes to make LLMs ubiquitous on its iPhone and Mac lineup and is exploring ways to make this possible.
Storing LLMs on flash memory has been difficult; Apple aims to fix this on machines with limited capacity
Under typical conditions, Large Language Models require AI accelerators and large amounts of DRAM to run. As reported by TechPowerUp, Apple is working to bring the same technology to devices that sport limited memory capacity. In a newly published paper, Apple describes how to bring LLMs to devices with limited memory. iPhones have limited memory too, so Apple researchers have developed a technique that uses flash chips to store the AI model’s data.
Since flash memory is available in abundance on Apple’s iPhones and Mac computers, there is a way to bypass this limitation with a technique called Windowing. In this method, the AI model reuses some of the data it has already processed, reducing the need to continuously fetch parameters from flash and making the entire process faster. The second technique is Row-Column Bundling: related pieces of data are stored together, so the AI model can read larger contiguous chunks from flash memory at once, speeding up processing.
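The two ideas can be illustrated with a toy sketch. This is not Apple’s implementation; the `FLASH` dictionary, `WindowedCache` class, and the bundled pairs are hypothetical stand-ins showing how reusing recently loaded parameters (windowing) and fetching related data in a single read (bundling) cut down on flash accesses:

```python
# Toy sketch (NOT Apple's code) of windowing + row-column bundling.
# "Flash" is simulated by a dict; "DRAM" by a small LRU cache.
from collections import OrderedDict

# Bundling: each entry stores an up-projection column and the matching
# down-projection row together, so one read returns both halves.
FLASH = {i: (f"up_col_{i}", f"down_row_{i}") for i in range(1000)}

class WindowedCache:
    """Keep parameters for recently used rows resident; evict the oldest."""
    def __init__(self, window):
        self.window = window
        self.cache = OrderedDict()   # row id -> bundled (up column, down row)
        self.flash_reads = 0

    def fetch(self, row_ids):
        out = {}
        for r in row_ids:
            if r in self.cache:
                # Windowing: reuse data already in memory, no flash access.
                self.cache.move_to_end(r)
            else:
                # Cache miss: one flash read returns the whole bundle.
                self.cache[r] = FLASH[r]
                self.flash_reads += 1
            out[r] = self.cache[r]
        # Evict least-recently-used rows beyond the window size.
        while len(self.cache) > self.window:
            self.cache.popitem(last=False)
        return out

cache = WindowedCache(window=8)
cache.fetch([1, 2, 3])   # 3 flash reads
cache.fetch([2, 3, 4])   # rows 2 and 3 are reused; only row 4 hits flash
print(cache.flash_reads)  # 4 reads total instead of 6
```

The point of the sketch is the bookkeeping: overlapping rows across consecutive steps are served from memory, and each miss pulls a larger, pre-grouped chunk rather than two scattered reads.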
Together with other methods, these techniques will allow AI models up to twice the size of the iPhone’s available RAM to run, with up to a 5x increase in speed on standard processors and up to 25x on graphics processors. There is plenty of evidence suggesting that Apple is serious about AI, starting with its own chatbot, which is internally called Apple GPT. Next year’s iPhone 16 is also rumored to feature upgraded microphones, which will be costly for the company but will allow for improved speech input, something that will be necessary for Siri to carry out a multitude of tasks.
It is also rumored that some form of generative AI will be baked into iOS 18 when it officially arrives next year, so while Apple has been behind the likes of OpenAI, Google, Amazon, and others, that gap might shrink considerably in 2024. Additionally, bringing this technology to iPhones, iPads, and Macs with limited memory could give these devices a unique selling point, but we will have to see how LLMs stored on flash memory perform first.