Run cleanup script at shutdown

Apr 19, 2012 at 3:40 PM

Would it be possible to create a hook for a shutdown event, allowing you to specify scripts for cleanup purposes?

Apr 21, 2012 at 7:40 PM

Can you clarify what you mean by a shutdown event? Do you mean the server being terminated, or the server being rebooted, etc.? As long as there's a way for the system to notify CloudInit of a shutdown event, it could be something I can implement, but I'm not really sure what you're asking for here.

Apr 23, 2012 at 11:31 AM

Hi,

We are looking to initiate cloudinit.net to bootstrap a server to configure itself. This configuration includes subscribing to a central server to download configurations, including future updates.

When a server goes down (i.e. a termination, but I think this applies to a regular shutdown as well), it will be necessary to unsubscribe from this central server and to clean up a number of resources (such as SQS queues). Having some kind of shutdown hook in cloudinit would allow you to configure decommissioning logic via cloudinit.net. It is a bit similar to the Windows Group Policy shutdown scripts configuration capability (see http://technet.microsoft.com/en-us/library/cc783802(v=ws.10).aspx), but that's only possible when you're in a domain, which is not always the case.

I realise it will be difficult to make it entirely bulletproof; in the end a server can crash without the possibility to run decommissioning scripts, but it will be very useful to us anyway.

Cheers,

Gerco

Apr 23, 2012 at 6:13 PM

Hi Gerco,

Thanks for explaining your proposal a little more. It's actually been something that I wanted to implement for a while. At least the bootstrapping portion would be quite interesting because it would make applying updates to CloudInit.NET much easier. I've also run into situations that would require a termination hook as well, but the biggest problem I see is that the event cannot be fired from the server running CloudInit. The main reason (I think) is that it wouldn't be very reliable: it's pretty difficult to make sure that the event will fire in all cases.

What I suggest, and something I may eventually implement myself, would be a system where the client server registers itself with the centralized server on boot, updates itself if required and runs the initialization script. The centralized server would be responsible for detecting whether the server is still online. It could achieve this by querying AWS or by actually trying to contact the instance itself. The client would need to send the cleanup script to the centralized server as part of the registration process, so that when the centralized server detects a terminated instance it executes the script and removes the instance from the registry.
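To make the registration idea a bit more concrete, here is a minimal Python sketch of the controller-side bookkeeping. It is only an illustration under assumptions: the `Registry` class and its method names are made up for this example, the cleanup script is represented as a plain callable, and the list of live instance ids is passed in by the caller (in practice it might come from querying AWS or pinging the instances).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Registry:
    """Hypothetical controller-side registry: instance id -> cleanup callable."""

    cleanups: Dict[str, Callable[[], None]] = field(default_factory=dict)

    def register(self, instance_id: str, cleanup: Callable[[], None]) -> None:
        # Called when a client registers itself on boot and hands over its cleanup.
        self.cleanups[instance_id] = cleanup

    def reconcile(self, live_instance_ids: Iterable[str]) -> List[str]:
        """Run the cleanup for every registered instance that is no longer live.

        Returns the ids that were cleaned up. An entry is only removed after
        its cleanup returns, so a cleanup that raises stays registered and
        can be retried on the next pass.
        """
        live = set(live_instance_ids)
        gone = sorted(i for i in self.cleanups if i not in live)
        for instance_id in gone:
            self.cleanups[instance_id]()
            del self.cleanups[instance_id]
        return gone


if __name__ == "__main__":
    registry = Registry()
    registry.register("i-aaa", lambda: print("cleaning up after i-aaa"))
    registry.register("i-bbb", lambda: print("cleaning up after i-bbb"))
    # Only i-aaa is still reported as live, so i-bbb gets cleaned up.
    print(registry.reconcile(["i-aaa"]))
```

Keeping the liveness check outside the registry means the same bookkeeping works whether the controller polls an API or reacts to notifications.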

I think it's a fairly complicated task to do correctly, which is the main reason why I decided not to add it to CloudInit.NET. One other thing you can look at is auto-scaling groups. Even if you don't use auto-scaling groups to automatically scale servers, Amazon actually provides notifications of new instances that were launched or terminated. I use this to automate Nagios for my auto-scaling groups.
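As a rough illustration of reacting to those notifications, the Python sketch below pulls the instance id out of a termination notice. It assumes the notification arrives as an SNS JSON envelope whose `Message` field is itself a JSON string with `Event` and `EC2InstanceId` fields; check those field names against the current AWS documentation before relying on them. The function name is hypothetical.

```python
import json
from typing import Optional

# Assumed event name for an auto-scaling termination notification.
TERMINATE_EVENT = "autoscaling:EC2_INSTANCE_TERMINATE"


def terminated_instance_id(sns_body: str) -> Optional[str]:
    """Return the terminated instance id, or None for any other notification.

    `sns_body` is assumed to be the raw SNS envelope as delivered to an
    SQS queue or HTTP endpoint.
    """
    envelope = json.loads(sns_body)
    # The inner message is a JSON document encoded as a string.
    message = json.loads(envelope["Message"])
    if message.get("Event") != TERMINATE_EVENT:
        return None
    return message.get("EC2InstanceId")


if __name__ == "__main__":
    # Hand-built sample body, not a captured notification.
    sample = json.dumps({
        "Type": "Notification",
        "Message": json.dumps({
            "Event": TERMINATE_EVENT,
            "EC2InstanceId": "i-0123abcd",
        }),
    })
    print(terminated_instance_id(sample))
```

A controller could feed the returned id into whatever cleanup it has registered for that instance.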

Thanks,

Brian

Apr 26, 2012 at 9:19 AM

Hi Brian,

I agree it won't be completely reliable and that also makes it doubtful whether to implement it at all. The subscription model as you described is actually something we will implement. We will use SNS and SQS for the communication and we were looking for a way to clean up the SQS resources when a server terminates.

I think it will boil down to this 'controller' server keeping an eye on things and taking action when needed.

Thanks for the good work!

Gerco

Apr 26, 2012 at 3:26 PM

Hi Gerco,

Exactly my point. I think that the most important thing is that the controller can do all of the cleanup work required, because there are many things that can cause a server to shut down and not fire an event. A simple example could be a power outage; hopefully that never happens in EC2, but there are other examples as well.

Another factor determining how successful this solution could be is how fast you need to react to a terminated server. If you can wait 30 minutes or an hour to clean up, then your problem is simple, because Amazon's API allows you to ask it basically anything you want about the servers. But you probably don't want to be asking it once every second. I think that's too much to ask, especially depending on the size of the network and how much time the cleanup takes.
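The trade-off can be put in rough numbers. The Python sketch below is plain arithmetic, not a measurement: the helper names are made up, and the 120-second cleanup duration is an arbitrary assumption for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(interval_seconds: int) -> int:
    """Number of status queries a controller issues per day at a fixed interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return SECONDS_PER_DAY // interval_seconds


def worst_case_cleanup_delay(interval_seconds: int, cleanup_seconds: int) -> int:
    """Upper bound on the time from termination to finished cleanup.

    Assumes the instance dies right after a poll, so a full interval
    passes before the controller notices, and then the cleanup runs.
    """
    return interval_seconds + cleanup_seconds


if __name__ == "__main__":
    # Intervals in seconds: every second, every minute, 30 minutes, one hour.
    for interval in (1, 60, 1800, 3600):
        print(interval, polls_per_day(interval),
              worst_case_cleanup_delay(interval, cleanup_seconds=120))
```

Polling every 30 minutes comes to 48 queries a day, against 86,400 when polling every second, and those counts apply per query type the controller needs to make.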

Good luck, sounds like a fun project.

Brian

Apr 27, 2012 at 1:55 PM

Hi Brian,

Just bumped into this: https://s3-eu-west-1.amazonaws.com/cloudformation-templates-eu-west-1/Windows_Roles_And_Features.template

This is based on a CloudFormation-enabled Windows image (https://aws.amazon.com/amis/microsoft-windows-server-2008-r2-base-cloudformation), which contains, among other things, functionality similar to that provided by cloudinit.net. I wasn't aware of it, were you?

Cheers,

Gerco

Apr 27, 2012 at 3:20 PM

Hi Gerco,

No, I wasn't aware of it actually. I know of CloudFormation but have never really used it. It looks like it performs roughly the same type of action, except I'm not very familiar with CloudFormation, so I'm not sure if there are any limitations. I still think we have the same problem as before: there's no way to hook into a server being destroyed. I'll have to look into it some more this weekend when I have some free time.

Brian

Apr 27, 2012 at 4:20 PM

CloudFormation is actually quite nice. I agree that it won't solve this termination issue, but in general it seems like an alternative to cloudinit.net.

Kr,

Gerco