We performed a manual evaluation with medical experts of all responses to 7 selected questions on the same 4 aspects. Results: Regarding the similarity of the responses from the 4 LLMs, the GPT-4 ...