Evaluating progress of LLMs on scientific problem-solving