
SpreadsheetLLM: Microsoft's SOTA approach to handling Excel spreadsheets with large language models.

The paper comes with no open-source code and no open-source model, and the approach requires fine-tuning, but it reaches the current SOTA for processing Excel.

My opinion:

The LLM route is practical. A CNN-based approach would require designing a dedicated network architecture, which is cumbersome.

The methods in the paper are not complex and can be implemented with a modest technical background. With careful design, the pipeline could be turned into an agent, which might improve accuracy.

A model fine-tuned on table recognition transfers easily to QA tasks, which greatly reduces the data-collection burden. Anyone who needs this can try training one.

The reversed (inverted-index) encoding is missing from many LLM training corpora, so fine-tuning is unavoidable.
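
To make the point about the reversed encoding concrete, here is a minimal sketch of my own, assuming the term refers to an inverted-index style encoding: instead of listing cells address by address, each distinct value is listed once with the addresses that hold it, and empty cells are dropped. The function names `invert_cells` and `to_prompt`, the `value|addresses` text format, and the sample sheet are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def invert_cells(cells):
    """Map each non-empty cell value to the list of addresses holding it.

    `cells` is a plain dict of address -> value (a hypothetical input shape).
    """
    index = defaultdict(list)
    for address, value in cells.items():
        if value is None or value == "":
            continue  # empty cells are dropped entirely
        index[str(value)].append(address)
    return dict(index)


def to_prompt(index):
    """Serialize the inverted index as one 'value|addr,addr' line per value."""
    return "\n".join(f"{value}|{','.join(addrs)}" for value, addrs in index.items())


# Hypothetical toy sheet: the value 100 is repeated and C1 is empty.
cells = {
    "A1": "Year", "B1": "Sales", "C1": "",
    "A2": 2023, "B2": 100,
    "A3": 2024, "B3": 100,
}
index = invert_cells(cells)
print(to_prompt(index))
```

A value-first layout like this looks quite unlike the row-by-row tables a model usually sees, which fits the argument that a model needs fine-tuning to read it reliably.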
