
Another technique that I've used with good success is to write a script that dumps out bash commands to delete files individually. I can visually inspect the file, analyze it with other tools, etc and then when I'm happy it's correct just "bash file_full_of_rms.sh" and be confident that it did the right thing.
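A minimal sketch of that pattern (the directory and filenames are made up for the demo). Each matching path becomes one `rm` line in a script you can read, grep, and count before running; single-quoting keeps spaces intact, though paths containing a literal single quote would need more care:

```shell
# Demo setup: a scratch directory with files to keep and files to delete.
mkdir -p /tmp/rm_demo
touch /tmp/rm_demo/keep.txt '/tmp/rm_demo/old file.bak' /tmp/rm_demo/junk.bak

# Emit one rm command per matching file instead of deleting anything yet.
find /tmp/rm_demo -type f -name '*.bak' |
while IFS= read -r f; do
    printf "rm -- '%s'\n" "$f"
done > /tmp/file_full_of_rms.sh

cat /tmp/file_full_of_rms.sh     # inspect first...
bash /tmp/file_full_of_rms.sh    # ...then pull the trigger
```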


This was taught to me in my first linux admin job.

I was running commands manually to interact with files and databases, but I was quickly shown that just writing all the commands out, one by one, gives you room to review them personally and to get a peer review, and it also helps with typos. I could ask a colleague, "I'm about to run all these commands on the DB, do you see any problem with this?". It also reduces the blame if things go wrong, since the commands passed approval by two engineers.

While I'm thinking back, another little tip I was told was to always put a "#" in front of any command I paste into a terminal. This stops accidentally copying a carriage return and executing the command.


> This stops accidentally copying a carriage return and executing the command.

For a one-liner, sure, but a multi-line command can still be catastrophic.

Showing the contents of the clipboard in the terminal itself (e.g. via xclip) or opening an editor and saving the contents to a file are usually better approaches. The latter lets you craft the entire command in the editor and then run it as a script.


From [0]:

[For Bash] Ctrl + x + Ctrl + e : launch editor defined by $EDITOR to input your command. Useful for multi-line commands.

I have tested this on Windows with a MINGW64 bash; it works similarly to `git commit`, by creating a new temporary file and detecting* when you close the editor.

[0] https://github.com/onceupon/Bash-Oneliner

* Actually I have no idea how this works; does bash wait for the child process to stop? does it do some posix filesystem magic to detect when the file is "free"? I can't really see other ways


It does create and give a temporary file path to the editor, but then simply waits for the process to exit with a healthy status.

Once that happens, it reads from the temporary file that it created.
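That flow can be re-created in a few lines of shell (a hypothetical helper, not bash's actual implementation): hand the editor a temp file, wait for the process to exit, and only run the file if the editor exited successfully. The demo uses a scripted "editor" so nothing interactive opens:

```shell
# Sketch of the edit-then-run flow behind Ctrl-x Ctrl-e.
edit_and_run() {
    local tmp
    tmp=$(mktemp) || return
    # Block until the editor process exits; run the file only on success.
    "${EDITOR:-vi}" "$tmp" && bash "$tmp"
    rm -f "$tmp"
}

# Fake editor: writes a command into the temp file and exits 0.
printf '#!/bin/sh\necho "echo hello from editor" > "$1"\n' > /tmp/fake_editor
chmod +x /tmp/fake_editor

EDITOR=/tmp/fake_editor edit_and_run   # prints: hello from editor
```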


The 'enable-bracketed-paste' setting is an easier and more reliable way to deal with that: https://unix.stackexchange.com/a/600641/81005

It will prevent any number of newlines from running the commands if they're pasted instead of typed.

You can enable it either in .inputrc or .bashrc (with `bind 'set enable-bracketed-paste on'`).


That was our SOP for running DELETE SQL commands on production too: a script that generates a .sql file that's run manually. It saved our asses a fair number of times.


Yeah, wish I'd learned that the easy way. Fresh into one of my first jobs I was working with a vendor's custom interface to merge/purge duplicate records. It didn't have a good method of record matching on inserts from the customer web interface so a large % of records had duplicates.

Anyway, I selected what I thought was a "merge all duplicates" option without previewing the results. What I had actually done was "merge all selected". So, the system proceeded to merge a very large % of the database... Into One. Single. Record.

Luckily the vendor kept very good backups, and so I kept my job. Because I also luckily had a very good boss and I had already demonstrated my value in other ways, he just asked me "Well, are you going to make that mistake again?". I wisely said no, and he just smiled and said "Then I think we're done here."

I have been particularly fortunate throughout my career to have very good managers. As much as managers get a lot of flak here on HN, done well they are empowering, not a hindrance, and I attribute a lot of success in my career to them.


> Yeah, wish I'd learned that the easy way.

I think that, if you've only learned something like that the easy way, then you haven't learned it yet. As long as everything's only ever gone right, it's easy to think, I'm in a rush this one time, and I've never really needed those safety procedures before, ….


At a previous job the DB admin mandated that everyone had to write queries that would create a temporary table containing a copy of all the rows that needed to be deleted. This data would be inspected to make sure that it was truly the correct data. Then the data would be deleted from the actual table by doing a delete that joined against the copied table. If for some reason it needed to be restored, the data could be restored from the copy.
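A small sketch of that discipline, assuming the sqlite3 CLI is available (any SQL engine works the same way; the table and column names here are invented for the demo):

```shell
db=/tmp/copy_delete_demo.db
rm -f "$db"

sqlite3 "$db" <<'SQL'
CREATE TABLE users (id INTEGER, status TEXT);
INSERT INTO users VALUES (1,'active'),(2,'stale'),(3,'stale');

-- Step 1: copy the rows you intend to delete, then inspect the copy.
CREATE TABLE users_to_delete AS
    SELECT * FROM users WHERE status = 'stale';

-- Step 2: delete by joining against the inspected copy, so only
-- reviewed rows can ever be removed. The copy doubles as a restore point.
DELETE FROM users
 WHERE id IN (SELECT id FROM users_to_delete);
SQL

sqlite3 "$db" 'SELECT COUNT(*) FROM users;'            # 1 row left
sqlite3 "$db" 'SELECT COUNT(*) FROM users_to_delete;'  # 2 rows kept as backup
```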


I tend to write one script that emits a list of files, and another that takes a list of files as arguments.

It's simple to manually test corner cases, and then when everything is smooth I can just

    script1 | xargs script2
It's also handy if the process gets interrupted in the middle, because running script1 again generates a shorter list the second time, without having to generate the file again.

When I'm trying to get script1 right I can pipe it to a file, and cat the file to work out what the next sed or awk script needs to be.
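A toy version of the two-script split (script names and paths are illustrative). The finder is always safe to run on its own, which is what makes the corner-case testing and the interrupted-run restart cheap:

```shell
mkdir -p /tmp/split_demo
touch /tmp/split_demo/a.tmp /tmp/split_demo/b.tmp /tmp/split_demo/keep.me

# script1 only *finds* work; running it never changes anything.
cat > /tmp/script1 <<'EOF'
#!/bin/sh
find /tmp/split_demo -type f -name '*.tmp'
EOF

# script2 acts on whatever paths it is handed as arguments.
cat > /tmp/script2 <<'EOF'
#!/bin/sh
rm -- "$@"
EOF
chmod +x /tmp/script1 /tmp/script2

/tmp/script1                       # dry run: just a list of files
/tmp/script1 | xargs /tmp/script2  # apply; rerunning script1 now prints less
```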


Ah, I’m glad I’m not the only one who did this. It also means that you can fix things when they break halfway. Say you get an error when the script is processing entry 101 (perhaps it’s running files through ffmpeg). Just fix the error and delete the first 100 lines.


The only issue with that is if subsequent lines implicitly assume that earlier ones executed as expected, e.g. without error.

Over-simplified example:

1. Copy stuff from A to B

2. Delete stuff from A

(Obviously you wouldn't do it like that, but just for illustration purposes.) It's all fine, but (2) assumes that (1) succeeded. If it didn't, maybe no space left, maybe missing permissions on B, whatnot, then (2) should not be executed. In this simple example you could tie them with `&&` or so (or just use an atomic move), but let's say these are many many commands and things are more complex.
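One common guard for that failure mode (not the only one) is a strict mode at the top of the generated script, so it aborts at the first failed command instead of marching on to the delete. The paths here are made up; the demo deliberately leaves B missing so step (1) fails:

```shell
cat > /tmp/move_stuff.sh <<'EOF'
#!/bin/bash
set -euo pipefail                               # abort on the first failure
cp /tmp/guard_demo/A/f.txt /tmp/guard_demo/B/   # (1) copy to B
rm /tmp/guard_demo/A/f.txt                      # (2) runs only if (1) worked
EOF

mkdir -p /tmp/guard_demo/A
echo data > /tmp/guard_demo/A/f.txt

# B does not exist, so the copy fails and the delete never executes:
bash /tmp/move_stuff.sh || echo "aborted before the delete"
```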


At the point you're doing this, you should be using a proper programming language with better-defined string-handling semantics, though. In every place this comes up you'll have access to Python and can call `os.unlink` directly and much more safely, plus a debugging environment you can actually step through if you're unsure.


Eh, I think that misses the point a bit. Use whatever you want to generate the output, but make the intermediary structure trivial to inspect and execute. If you're actually taking the destructive actions within your complicated* logic then there's less room to stop, think, and test.

You could always generate an intermediary set, inspect/test/etc, and then apply it with Python. I've done that too, works just as well. The important thing is to separate the planning step from the apply step.

* where "complicated" means more complicated than, for ex, `rm some_path.txt` or `DELETE FROM table WHERE id = 123`.
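The plan/apply separation can be as small as a plan file between the two steps (names are illustrative; feeding paths through xargs like this assumes no whitespace in filenames):

```shell
mkdir -p /tmp/plan_demo
touch /tmp/plan_demo/x.log /tmp/plan_demo/y.log /tmp/plan_demo/keep.txt

# Plan: record what *would* be removed; this step touches nothing.
find /tmp/plan_demo -type f -name '*.log' > /tmp/plan.txt
cat /tmp/plan.txt                 # inspect/test the plan here

# Apply: act only on the reviewed plan file, not on fresh logic.
xargs rm -- < /tmp/plan.txt
```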



