IaaS Automated Powersaving, Green Sustainability - Pt.2

Up and running with powersaving
The goal of this article is to show an example implementation of scheduled powering off (and on) of VMs in our own on-premises datacenter, the same way the hyperscalers (Google GCP, AWS, and Azure) do it for their VMs.
This series is divided into 3 parts. This is the second article, containing a practical walk-through of an example of how to do this with VMware technology.
Note: If you missed the hyperscaler discussion, how hyperscalers actually power off and power on VMs on a schedule, and also “the WHY”, then please go ahead and read the article IaaS Automated Powersaving, Green Sustainability - Pt.1.
An example: how you can do it
Since you are aware of the costs of your infrastructure, you are probably using more than one technology from VMware.
In this article I suggest a simple way to implement Power Off and Power On schedules for your datacenter using Orchestrator, Python, and Aria Automation.
Requirements
To follow along you need Aria Automation. Aria Automation contains Aria Orchestrator and Aria Service Broker, both of which we use in this example. Aria Automation also contains SaltStack Config, but we’re not using that this time.
A little Python knowledge is helpful but not necessary, because we provide the scripts you need.
Using the Self Service portal
Requesting a deployment with power save
We simplify the consumption of IT services for users by offering self-service provisioning in a portal. In this Multi-Cloud Management service catalog, we as an IT team have predefined our service offering. Here I have a Catalog Item called Save Power, which deploys virtual machines and tags them with the tag “powersave”.
Request
When we click Request, we are presented with a request form where we can change Power Save and VM Size, both of which will affect our savings:
By clicking the information icons on those two options, we can get more information about the different options.
Note that we’ve chosen to keep it simple, with an enforced power off at 18:00 and power on at 06:00. This could of course be made customizable, but there are multiple reasons to keep it simple.
Powersave mode
Below is the explanation of the impact of the Power Save mode:
The deployment Size
Below is the explanation of the sizes of the servers we’re about to deploy:
Price
Since our automation system (Aria Automation) can use price cards, or be connected to the operations system (Aria Operations) with its price cards, we can also calculate the monthly price for the different options, e.g. when we choose an X-large or a small server.
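To get a rough feel for the effect of the 06:00 to 18:00 schedule on the monthly price, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names and the hourly rate are hypothetical, a 30-day month is assumed, and it assumes the price scales with powered-on hours, which may not match how your price cards are actually configured (storage, for instance, is often charged regardless of power state).

```python
def powered_on_fraction(on_hour: int = 6, off_hour: int = 18) -> float:
    """Fraction of each day a VM is powered on, for a same-day window."""
    if not 0 <= on_hour < off_hour <= 24:
        raise ValueError("expected 0 <= on_hour < off_hour <= 24")
    return (off_hour - on_hour) / 24


def monthly_compute_cost(hourly_rate: float, on_fraction: float = 1.0, days: int = 30) -> float:
    """Monthly cost, assuming cost is proportional to powered-on hours."""
    return hourly_rate * 24 * days * on_fraction


if __name__ == "__main__":
    rate = 0.25  # hypothetical hourly rate, not taken from any real price card
    always_on = monthly_compute_cost(rate)
    with_powersave = monthly_compute_cost(rate, powered_on_fraction())
    print(f"Always on: {always_on:.2f}, with powersave: {with_powersave:.2f}")
```

With the 06:00 to 18:00 window the VM runs 12 of 24 hours, so under these assumptions the compute portion of the price is halved.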
Slack Notification
The end result is that every morning and every evening there is a Power On and Off of the servers tagged with powersave = true. There is a Slack notification each time:
See further down for explanation about the Slack portion of the Python script that makes this happen.
**Behind the Scenes**
Aria Automation Cloud Template
The blueprint, a.k.a. Cloud Template: Behind the self-service choice there is a simple Cloud Template. In other words, Infrastructure as Code (IaC) written in a declarative language, YAML, that defines the desired state of our cloud infrastructure.
You can find a copy of the template's YAML code on GitHub here
The tagging: The main thing about the Cloud Template and the VM you are about to create is that it has a specific tag. The tag is created with this code snippet within the Cloud Template:
tags:
  - key: os
    value: windows
  - key: powersave
    value: ${input.powersave}
vSphere Tags: this tagging is also reflected on the VMs in vCenter.
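Once machines carry the tag, a script has to pick out the ones where powersave is set to true. The following is a minimal sketch of that selection step only. The resource shape used here (a dict with an `id` and a `tags` list of key/value pairs) is an assumption made for illustration and is not necessarily what the API or the repository script uses; the function names are hypothetical as well.

```python
def is_powersave_enabled(tags):
    """Return True if the tag list contains powersave set to true."""
    for tag in tags or []:
        if tag.get("key") == "powersave":
            # The value may arrive as a boolean or a string, so normalize it
            return str(tag.get("value")).strip().lower() == "true"
    return False


def select_powersave_ids(resources):
    """Return the IDs of all resources tagged powersave = true."""
    return [r["id"] for r in resources if is_powersave_enabled(r.get("tags"))]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only
    sample = [
        {"id": "vm-1", "tags": [{"key": "os", "value": "windows"},
                                {"key": "powersave", "value": "true"}]},
        {"id": "vm-2", "tags": [{"key": "powersave", "value": "false"}]},
        {"id": "vm-3", "tags": []},
    ]
    print(select_powersave_ids(sample))
```

Normalizing the value to a lowercase string means the check behaves the same whether the template input is rendered as a boolean or as text.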
The Orchestrator workflow and Schedule
To make sure machines tagged with powersave = true are powered off and on as scheduled, we have created two scheduled workflows: one at 06:00:00 in the morning that powers on VMs, and one at 18:00:00 (6 pm) that powers off VMs. Both scheduled workflows call the workflow named “bgro-powersave-schedule”.
Here is an example of the scheduled task:
Python code / Orchestrator Workflow
Get the code
Get your copy of the code from this GitHub page
We use a single workflow called “bgro-powersave-schedule”. In that workflow, we have one scriptable task with a Python script that contains most of the intelligence. It finds all deployments with the powersave tag set to true, then powers them on or off accordingly.
The Python script behind the scriptable task in the workflow “bgro-powersave-schedule” in Aria Orchestrator decides whether to power the VMs on or off based on a set time window.
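The time-window decision could look roughly like the sketch below. It is not the code from the repository, which may decide differently; the function name and its parameters are hypothetical, and the defaults simply mirror the 06:00 and 18:00 schedule described in this article.

```python
from datetime import time


def desired_action(now: time,
                   power_on_at: time = time(6, 0),
                   power_off_at: time = time(18, 0)) -> str:
    """Return "on" if `now` falls inside the powered-on window, else "off"."""
    if power_on_at < power_off_at:
        # Same-day window, e.g. 06:00 to 18:00
        inside = power_on_at <= now < power_off_at
    else:
        # Window that crosses midnight, e.g. 22:00 to 06:00
        inside = now >= power_on_at or now < power_off_at
    return "on" if inside else "off"


if __name__ == "__main__":
    print(desired_action(time(6, 0)))   # start of the window
    print(desired_action(time(18, 0)))  # end of the window
```

Because both scheduled runs call the same workflow, a check like this lets one script serve the morning and the evening schedule: the run at 06:00 lands inside the window and the run at 18:00 lands outside it.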
Python
The Python script uses the Aria Automation (vRA) API to control the machines and also has a Slack webhook to send notifications when machines are powered off or on.
Functions
The script has these functions:
- `power_off_resources(resource_ids, inputs, bearer_token)`: Powers off resources given their IDs.
- `power_on_resources(resource_ids, inputs, bearer_token)`: Powers on resources given their IDs.
- `get_resource_ids_with_powersave_tag(bearer_token, inputs)`: Retrieves resource IDs with the "powersave" tag.
- `vraauth(inputs)`: Authenticates with the vRA API (returns a bearer token).
- `send_to_slack(message, inputs)`: Sends a message to Slack.
The intelligence is of course in the `power_off_resources` and `power_on_resources` functions, which loop through the provided resource IDs and power them off or on using the vRA API.
Function to power on resources
import requests

# Function to power on resources
def power_on_resources(resource_ids, inputs, bearer_token):
    # vRA API URL
    url = inputs["vra_url"]
    # vRA API headers with bearer token
    vraheaders = {
        "accept": "application/json",
        "content-type": "application/json",
        "Authorization": "Bearer " + bearer_token
    }
    # Loop through each resource ID and power it on
    with requests.Session() as session:
        for resource_id in resource_ids:
            # vRA API payload to power on the resource
            payload = {
                "actionId": "Cloud.vSphere.Machine.PowerOn",
                "inputs": {},
                "reason": "Power On"
            }
            # Send the power on request to vRA (verify=False skips TLS certificate verification)
            resp = session.post(f"{url}/deployment/api/resources/{resource_id}/requests", headers=vraheaders, json=payload, verify=False)
            try:
                # Raise an error if the response status code is 4xx or 5xx
                resp.raise_for_status()
                # Send a message to Slack to inform that the resource is being powered on
                send_to_slack(f"POWERSAVE: Power on successfully called for resource ID: {resource_id}", inputs)
            except requests.exceptions.HTTPError as err:
                # If the status code is 400, log the error and continue to the next resource
                if err.response.status_code == 400:
                    print(f"Power on failed for resource ID {resource_id}: {err}. Is it already powered on?")
                else:
                    # If the status code is not 400, raise the error
                    raise
Function to power off resources
# Function to power off resources
def power_off_resources(resource_ids, inputs, bearer_token):
    # vRA API URL
    url = inputs["vra_url"]
    # vRA API headers with bearer token
    vraheaders = {
        "accept": "application/json",
        "content-type": "application/json",
        "Authorization": "Bearer " + bearer_token
    }
    # Loop through each resource ID and power it off
    with requests.Session() as session:
        for resource_id in resource_ids:
            # vRA API payload to power off the resource
            payload = {
                "actionId": "Cloud.vSphere.Machine.Shutdown",
                "inputs": {},
                "reason": "Power Off"
            }
            # Send the power off request to vRA (verify=False skips TLS certificate verification)
            resp = session.post(f"{url}/deployment/api/resources/{resource_id}/requests", headers=vraheaders, json=payload, verify=False)
            try:
                # Raise an error if the response status code is 4xx or 5xx
                resp.raise_for_status()
                # Send a message to Slack to inform that the resource is being powered off
                send_to_slack(f"POWERSAVE: Power off successfully called for resource ID: {resource_id}", inputs)
            except requests.exceptions.HTTPError as err:
                # If the status code is 400, log the error and continue to the next resource
                if err.response.status_code == 400:
                    print(f"Power off failed for resource ID {resource_id}: {err}. Is it already powered off?")
                else:
                    # If the status code is not 400, raise the error
                    raise
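The power on and power off functions differ only in the `actionId` and `reason` they send. As a possible refactoring, not something taken from the repository, the shared part could be pulled into one small helper. The helper name and the `POWER_ACTIONS` table are hypothetical; the action IDs and the endpoint path are the ones used in the functions above.

```python
# Maps a short action name to the actionId and reason used by the script
POWER_ACTIONS = {
    "on": ("Cloud.vSphere.Machine.PowerOn", "Power On"),
    "off": ("Cloud.vSphere.Machine.Shutdown", "Power Off"),
}


def build_power_request(vra_url, resource_id, action):
    """Return (endpoint, payload) for a power on or power off request."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"Unknown action {action!r}, expected 'on' or 'off'")
    action_id, reason = POWER_ACTIONS[action]
    # Strip a trailing slash so the endpoint never contains a double slash
    endpoint = f"{vra_url.rstrip('/')}/deployment/api/resources/{resource_id}/requests"
    payload = {"actionId": action_id, "inputs": {}, "reason": reason}
    return endpoint, payload


if __name__ == "__main__":
    # Hypothetical URL and resource ID, for illustration only
    print(build_power_request("https://vra.example.com", "1234", "off"))
```

Keeping the request construction free of network calls also makes it easy to check without a live Aria Automation instance.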
The send_to_slack function
import json
import requests

# Function to send a message to a Slack channel
def send_to_slack(message, inputs):
    # Slack webhook URL
    webhook_url = inputs["slack_webhook_url"]
    # Slack message payload
    payload = {
        "text": message
    }
    # Send the message to Slack using the requests library
    response = requests.post(
        webhook_url, data=json.dumps(payload),
        headers={'Content-Type': 'application/json'}
    )
    # Raise an error if the response status code is not 200 OK
    if response.status_code != 200:
        raise ValueError(
            f'Request to Slack returned an error {response.status_code}, the response is:\n{response.text}'
        )
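The script sends one Slack message per machine. With many tagged VMs, you might prefer one summary message per run. The sketch below shows one way to build such a payload. It is a design alternative and not part of the repository code, and the function name and message wording are hypothetical. The resulting JSON string has the same `{"text": ...}` shape that `send_to_slack` posts to the webhook.

```python
import json


def build_summary_payload(action, succeeded, failed):
    """Build a JSON payload summarizing one power on/off run."""
    lines = [f"POWERSAVE: Power {action} called for {len(succeeded)} resource(s)."]
    if failed:
        # List the failed resource IDs so they can be followed up manually
        lines.append(f"Failed for {len(failed)} resource(s): " + ", ".join(failed))
    return json.dumps({"text": "\n".join(lines)})


if __name__ == "__main__":
    # Hypothetical resource IDs, for illustration only
    print(build_summary_payload("off", ["vm-1", "vm-2"], ["vm-3"]))
```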
Conclusion
If you download everything needed from the Git repository, as mentioned above under "Get the code", the rest is fairly well documented within the code itself. Pay attention to what it does.