As always, great questions @jgkim!
So, I would say that the object of analysis is the data that is actually published, not the data the government works with internally. The reason is that we want to assess the state of open data. So if the government published a document as a PDF, even though we can clearly see it was produced from an Excel file, I would still mark it as not machine readable. Users would not be able to use the data if it's not in machine-readable form, and I don't want to reward a government for making the data fully open only to itself and not to the public. In your case, if the data is only published in HTML, it is not machine readable.
BTW - be very careful with the Spending dataset: we are looking for a very detailed dataset, not just a high-level one.