In vonbrand's comment on "How do I scan for invalid characters on gedit?":
A quick and dirty way would be to edit the file (probably adding and
deleting something will be enough to make gedit think it was changed),
and save as another name. Comparing the original and the changed file
(diff should help here) should tell you what is going on
Why would this help identify which byte values in a text file are invalid with respect to the encoding that gedit used when it opened the file and reported invalid characters?
When I add and delete something in the text and save it to another file in gedit, in what way will the text be changed?
If the text contains a mixture of multiple encodings, will it be converted to a single encoding?
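For comparison, here is a minimal sketch of how I would look for the bad bytes directly, without relying on whatever gedit does when it re-saves the file. It assumes the file was meant to be UTF-8 (I don't know which encoding gedit actually tried), and the function name `find_invalid_utf8` is just something I made up for illustration.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, bad_bytes) pairs for byte runs that are not valid UTF-8.

    Assumption: UTF-8 is the intended encoding; change the codec name
    below to test against a different one.
    """
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything from the current position onward.
            data[pos:].decode("utf-8")
            break  # the rest of the data is valid
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded,
            # so shift them back to absolute file offsets.
            start = pos + err.start
            end = pos + err.end
            bad.append((start, data[start:end]))
            pos = end  # resume scanning just past the offending bytes
    return bad


# Small in-memory sample: 0xE9 is "é" in Latin-1 but is not valid UTF-8 here.
sample = b"caf\xe9 ok \xff\n"
for offset, chunk in find_invalid_utf8(sample):
    print(f"offset {offset}: {chunk.hex()}")
```

To run it on a real file, read the file in binary mode (for example `open("original.txt", "rb").read()`, where `original.txt` is a placeholder name) and pass the bytes to the function. This only tells me where the invalid bytes are; it does not answer what gedit writes in their place when saving.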