I was thinking something similar. If the font uses kerning, and you know the rendering engine and the algorithm used to block out the text, I wonder whether you could even get the exact text back. Or, at a minimum, rule out words based on the available information. This is not a field I am familiar with, but I bet there are a lot of ways to uncover the redacted values.
This is the government. The documents are faxed/photo-copied/etc etc. They are a bunch of random docs from random sources and the original creators never thought 'This will be redacted'. They just fired up Word and started typing.
In some redacted documents, there is even an alphabetical word index at the end with a list of pages on which the words appear.
The redacted words are also redacted in the word index, but the alphabetically preceding and succeeding words are visible, as is the number of index lines taken up by the redacted word's entry, which correlates with the number of appearances of that word.
This seems like rather useful information to constrain a search by such a tool.
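As a rough illustration of how the index could narrow a search: the visible entries before and after the redacted one bound it alphabetically, so any candidate has to sort strictly between them. The sketch below is only an assumption about how such a filter might look; the function name `candidates_between` and the sample vocabulary are made up for the example and are not taken from the tool.

```python
def candidates_between(prev_word, next_word, vocabulary):
    """Return the vocabulary words that sort strictly between two index neighbours.

    prev_word and next_word are the visible index entries on either side of
    the redacted entry. Comparison is case-insensitive, which is an assumption
    about how the index was sorted.
    """
    lo, hi = prev_word.casefold(), next_word.casefold()
    return sorted(w for w in vocabulary if lo < w.casefold() < hi)


# Hypothetical vocabulary and neighbours, for illustration only.
vocab = ["apple", "banana", "blueberry", "cherry", "date"]
print(candidates_between("avocado", "cranberry", vocab))
```

The number of index lines the redacted entry takes up could be used as a second filter on top of this, but that depends on the index layout, so it is left out here.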
Does it even matter? The kind of people who see stuff like this and are still fine with it are likely fine with anything else that's discovered as well.
this tool coming out on the heels of the DOJ releasing a trove of redacted documents doesn't come across as coincidental to me. let's think about this for a bit longer from that idea of using this on legal evidence...why would doctoring a legal document be prohibited?
Generally there is nothing illegal about altering a legal document, or even a strict definition of what counts as a legal document. Under some circumstances it could be illegal to alter a document and use that for fraud, or submit an altered document to a court or government agency. If the doctoring falsely defames someone then you could also open yourself up to a civil suit.
Perhaps I misunderstand what "sue" includes in US jurisdictions but prohibition in this context ought to be criminalisation, i.e. something that happens in the relation between the individual and the state, and to me 'suing' is something that happens in a relation between individuals.
For all we know, Epstein could have punished Trump and made him write "I'm a little bitch boy" 2,000 times and it took up 119 pages so every line got redacted. /madlibs
Because to me it seems like altering and disseminating a document would be under 1st amendment protection, unless combined with some action that e.g. causes someone else harm or tricks the state into doing something it should not do or something.
I guess you mean official legal documents or something, but your sentence doesn't say that or mention those so it comes across in a very confusing way (it implies that using Word is illegal because every time you type something you alter your document)
Yes, this is at best a project for trolling, and it is getting voted on because people naively think it has some useful applications regarding the Epstein documents. It does not.
why unredact, rather than just edit the pdf to remove the redaction box and insert whatever you want? presumably you'd want a viewer to see that you modified a redaction, but why?
The point is you can perform a box dimension attack.
If you have a known input, you can match all outputs.
Example: Document that DOJ took down and reuploaded that redacted Trump's name when it was previously available. They used the same size boxes in each location.
You cannot do this with handwriting, but fonts have known widths.
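A minimal sketch of the width-matching idea, assuming a proportional font where each character has a fixed advance width. The widths in `ADVANCE` are invented for the example (real values would have to come from the actual font's metrics), kerning is ignored, and the names `text_width` and `matching_words` are hypothetical rather than anything from the tool.

```python
# Hypothetical advance widths in font units. These numbers are made up;
# a real attempt would read them from the font used in the document.
ADVANCE = {"i": 2, "l": 2, "t": 3, "a": 5, "e": 5, "n": 5, "o": 5, "m": 8, "w": 7}
DEFAULT_ADVANCE = 5  # fallback for characters missing from the table


def text_width(text, advance=ADVANCE, default=DEFAULT_ADVANCE):
    """Sum per-character advance widths (no kerning, no ligatures)."""
    return sum(advance.get(ch, default) for ch in text)


def matching_words(box_width, words, tolerance=0):
    """Keep the words whose rendered width is within tolerance of the box width."""
    return [w for w in words if abs(text_width(w) - box_width) <= tolerance]


words = ["mom", "ill", "tea", "not"]
print(matching_words(13, words))
```

With these made-up widths, more than one word fits a 13-unit box, so width alone narrows the list rather than giving a single answer. A monospace font would make every word of the same length collide.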
I see another similar comment, but I have an explicit question. Does the following from the README hold any water at all, legally?
> I am not responsible for your use of this tool. ... By using this tool you claim all legal liability for any documents you create with it.
Without a detailed and carefully worded license, does this confer any protection whatsoever?
Having asked that, I'm not sure what protection would be needed. Could a victim of abuse of this tool (or similar) seek some sort of take-down of the tool? It seems unlikely but I'm curious about the scenario.
Waterluvian | 23 hours ago
jmward01 | 22 hours ago
amarant | 22 hours ago
Seems silly not to use a mono space font in these cases.
sa46 | 22 hours ago
jstanley | 21 hours ago
[OP] kvthweatt | 21 hours ago
jmward01 | 19 hours ago
estimator7292 | 22 hours ago
DavidSJ | 22 hours ago
[OP] kvthweatt | 21 hours ago
mapontosevenths | 22 hours ago
The truth has become irrelevant.
https://www.justice.gov/epstein/files/DataSet%208/EFTA000250...
[OP] kvthweatt | 21 hours ago
dylan604 | 21 hours ago
8note | 22 hours ago
what exactly does this mean? misrepresenting the altered document as unaltered?
i can't imagine it being illegal to do madlibs
[OP] kvthweatt | 21 hours ago
It must be accurate. That being said, you still shouldn't reupload your altered document anywhere.
cess11 | 21 hours ago
dylan604 | 21 hours ago
nradov | 20 hours ago
dylan604 | 15 hours ago
nradov | 14 hours ago
cess11 | 2 hours ago
[OP] kvthweatt | 21 hours ago
Standard CYA procedure
cess11 | 2 hours ago
[OP] kvthweatt | an hour ago
The CYA is just me saying I'm not responsible for anything anyone makes, because anyone can make a document say anything with this tool.
circuit10 | 21 hours ago
rexpop | 20 hours ago
vessenes | 20 hours ago
Nevermark | 20 hours ago
Of course not illegal. When filled out with the official unredaction font [0], time stamped by the Ministry of Information, and delivered in triplicate, personally to Interrogation within 46 hours.
[0] https://en.wikipedia.org/wiki/San_Francisco_(decorative_type...
NewsaHackO | 19 hours ago
[OP] kvthweatt | 19 hours ago
typeofhuman | 22 hours ago
This doesn't remove redactions, it lets you write over them.
dundarious | 19 hours ago
This is trash, IMO.
[OP] kvthweatt | 18 hours ago
Added images to show the tool in action.
dundarious | 12 hours ago
websiteapi | 22 hours ago
speedgoose | 22 hours ago
dylan604 | 21 hours ago
[OP] kvthweatt | 21 hours ago
cortesoft | 21 hours ago
dylan604 | 21 hours ago
fn-mote | 21 hours ago
You'd never be blasé about the same information about your password.
Plus with redaction there's a pretty small number of possible words when the boxes are small.
BLKNSLVR | 20 hours ago
jaredwiener | 21 hours ago
yellow_lead | 21 hours ago
For instance, this file says Mona if you remove the top layer https://www.justice.gov/epstein/files/DataSet%208/EFTA000136...
Some others I've seen include 1-3 more letters than are in the redaction.
brikym | 21 hours ago
Redactle.net has something similar where you can double-click or tap-hold then type a note over the redacted word.
austinjp | 19 hours ago
[OP] kvthweatt | 19 hours ago
[OP] kvthweatt | 19 hours ago
The redactions by DOJ are so sloppy that you can COPY AND PASTE blocks of text to a new text editor and see the "redacted" text beneath.
Try it yourself.
They did not properly redact many documents.
It's about to get wild.
[OP] kvthweatt | 18 hours ago
It works now.
[OP] kvthweatt | 55 minutes ago