As part of our continued efforts to make using Infrascale a pleasant experience that simplifies your backup and disaster recovery lives, we’ve started spring cleaning early this year with the release of Infrascale Disaster Recovery (IDR) v6.13.2.
Again, big thanks to all of our partners and admins who helped report these issues and find resolutions.
Continue reading for a detailed explanation of the release or scroll to the bottom for the list.
For Those Protecting VMware Environments
Issues Running Hourly Backups of VMware
A large piece in the fight against data loss is your recovery point objective (RPO), which is determined by how frequently your backups run. More frequent backups mean a lower risk of data loss, and that’s good. However, many partners trying to reduce that risk to a mere hour reported JVM crashes when running in VMware environments, and resolving them required running a manual full backup. We’ve fixed this issue and are happy to say that you can now run hourly backups without concern.
VMware Snapshots Causing Storage Issues
Next up were reports of production storage being consumed by leftover VMware snapshots. Occasionally, our system left these VMware snapshots behind rather than cleaning them up, leaving some rather tedious work for admins. We’ve fixed the automated clean-up of these snapshots so you’ll no longer run into this problem.
Appliance Disconnects from VMware After Reboot
The classic tech-support steps: is it plugged in? Have you tried restarting it? In many cases, we found that primary appliances would not automatically reconnect to VMware after a reboot, meaning no backups would run. The result was an influx of monitoring errors reporting skipped backup jobs, sending support into a frenzy. We’ve fixed this so reboots no longer require a manual reconnection to VMware; after you update to v6.13.2, the appliance will reconnect to VMware automatically.
Additionally, we improved memory usage during VMware backup, so your backups should perform a bit faster now with fewer memory peaks.
And Now, the Bulk of the Release
Remote Access Goes Dark After a Connection Interruption
You’re working on a recovery, test or real, and suddenly you lose remote access: cue heart palpitations and expletives. Previously, to reconnect you had to go to the primary and either restart the whole appliance or disable and re-enable remote access. That costs time, and time is money, especially in a real downtime scenario. It’s an even bigger problem if you’re not on site, which is most cases. We’ve added some logic on our end to prevent this from happening.
When accessing remote VMs after running cloud boot, admins would receive timeouts on the seventh session and beyond. We’ve raised the limit on how many remote access windows you can open from the Dashboard at one time. You’ll still want to keep an eye on the performance of your machine as you increase the number of sessions, but now you can launch as many as you like.
“Unknown” Status on Appliance Page
Similar to the remote access fix, those scary instances of “lost” appliances were also resolved. When a connection interruption occurred between the appliance and the cloud infrastructure, admins would simply receive a shoulder shrug from the dashboard: no monitoring data, no usage data, nothing. The only fix was to reboot the appliance. We’ve changed the behavior so that your appliance doesn’t disappear in such a case, and we’ve put in work to make connections more stable.
Backups Stop with error “VimSDK Error: Bad Parameters of Function”
Some reports of a “VimSDK error: bad parameters of function” started to pop up from the community. We found that the issue occurred when Windows provisioned a disk with a partition larger than the disk itself, causing backups to fail with the aforementioned error. Our system now recognizes this condition and continues with backups as before.
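To make the out-of-bounds condition concrete, here is a minimal Python sketch of detecting a partition that extends past the end of its disk and limiting the backup to the part that actually exists. It is an illustration only, not the appliance’s code: the function name `clamp_partition` and the choice to trim the range (rather than handle it some other way) are assumptions.

```python
from typing import Tuple


def clamp_partition(start: int, length: int, disk_size: int) -> Tuple[int, int]:
    """Return the (start, length) of the partition range that fits on the disk.

    All values are in bytes. Hypothetical helper: trimming the range is one
    possible way to handle a partition that claims to be larger than its disk.
    """
    if start < 0 or length < 0 or disk_size < 0:
        raise ValueError("start, length and disk_size must be non-negative")
    if start >= disk_size:
        # The partition begins beyond the end of the disk: nothing to read.
        return start, 0
    # Never read past the physical end of the disk.
    return start, min(length, disk_size - start)
```

For example, a partition declared as 120 bytes on a 100-byte disk would be read as 100 bytes, while a partition that already fits is left unchanged.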
Can’t boot inconsistent NTFS Volumes
In the scramble after hard resetting a production server, administrators will often need to run a system utility, ChkDsk, to put the system back into a consistent state. If admins didn’t get the chance before a backup ran, then our system would be unable to boot that or any subsequent version. While we can’t make things nicer on the Windows side, we did add a check before booting to see whether the volume is inconsistent and, if necessary, we’ll run the ChkDsk utility so the boot can proceed as expected.
Primary appliance stops working if the secondary is running a different version
For paired-appliance setups that are not replicating to Infrascale’s cloud, there was no automated update on the secondary appliance upon updating the primary. This caused the backups to fail as well as any replications, loading up your ticketing queue with a ton of errors. We’ve now automated the update of the secondary appliance once you’ve updated your primary.
Sluggish Backups after a Firmware Update
In a few cases, we had reports of extremely sluggish backup performance after a firmware update. We found an error that moved a vital catalog off the solid-state drive (SSD) and onto the primary storage drives. While we can’t automate the fix, we have put in place a warning telling the administrator to contact Infrascale support so we can dig in and move the catalog back to the right spot.
Unable to Download Files via “Browse and Restore”
The granular file recovery from the cloud appliance didn’t work. This is obviously a super-critical issue, and we commend both the reporter and our team for jumping on it ASAP. It’s now fixed.
Hyper-V Recovery Speed and Bandwidth Improvement
During a backup, we protect only the data that exists and make a note of the empty blocks on each volume. During recovery, however, we were still transferring these “empty” blocks. Transferring one empty block isn’t so bad, but transferring millions of them could significantly impact recovery time and waste valuable download bandwidth. We’ve changed the behavior to no longer send the empty blocks; instead, we give the recovery engine instructions to provision as many empty blocks on each volume as there were when it was protected.
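The idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual recovery engine: the names `pack` and `unpack`, the tiny block size, and the assumption that the volume length is a multiple of the block size are all hypothetical.

```python
from typing import Dict, Tuple

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def pack(volume: bytes, block_size: int = BLOCK_SIZE) -> Tuple[int, Dict[int, bytes]]:
    """Return (total_blocks, {block_index: data}) holding only non-empty blocks.

    Assumes len(volume) is a multiple of block_size.
    """
    total = len(volume) // block_size
    blocks = {}
    for i in range(total):
        chunk = volume[i * block_size:(i + 1) * block_size]
        if any(chunk):  # True if at least one byte is non-zero
            blocks[i] = chunk
    return total, blocks


def unpack(total: int, blocks: Dict[int, bytes], block_size: int = BLOCK_SIZE) -> bytes:
    """Provision `total` zeroed blocks locally, then write in the received ones."""
    out = bytearray(total * block_size)  # bytearray(n) is zero-filled
    for i, chunk in blocks.items():
        out[i * block_size:(i + 1) * block_size] = chunk
    return bytes(out)
```

Only the block count and the non-empty blocks cross the wire; the receiving side recreates the empty blocks itself, so a mostly empty volume costs very little bandwidth to restore.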
Unnecessary Job Replication
This is another, even larger, bandwidth saver. If you had an appliance running without replication, then down the line began replicating offsite, your appliance might have been unnecessarily replicating data that would just be removed due to retention settings for the job. To resolve this, we now check the retention settings before each replication event begins, and, if the data is set to be deleted upon arrival, we simply don’t replicate and cancel the job. The replication status will indicate that the job was cancelled due to retention policies.
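As a rough sketch of that pre-replication check, assuming retention is expressed as a simple time-to-live per job: the `Job` fields and the function names below are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


@dataclass
class Job:
    name: str
    completed_at: datetime
    retention: timedelta  # hypothetical: how long the remote side keeps the job


def should_replicate(job: Job, now: datetime) -> bool:
    """Return False when the job would already be past retention on arrival."""
    return job.completed_at + job.retention > now


def plan_replication(jobs: Iterable[Job], now: datetime) -> Tuple[List[str], List[str]]:
    """Split job names into (to_send, cancelled_due_to_retention)."""
    to_send, cancelled = [], []
    for job in jobs:
        if should_replicate(job, now):
            to_send.append(job.name)
        else:
            cancelled.append(job.name)
    return to_send, cancelled
```

With a seven-day retention, a job completed thirty days ago lands in the cancelled list, while one completed yesterday is still sent.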
- FIX: JVM crash during frequent incremental VMware backups
- FIX: Background cleanup of VMware snapshots that were left behind
- FIX: Do not replicate jobs that will be deleted remotely due to retention
- FIX: Remote access may become unavailable after interruption of network connection between appliance and cloud infrastructure
- FIX: Unable to open more than 6 remote access windows from Dashboard at the same time
- FIX: Stalled information and “Unknown” status on Appliances page in Dashboard after interruption of network connection between appliance and cloud infrastructure
- FIX: Restore of Hyper-V VMs only transfers information inside disk image and doesn’t transfer empty blocks
- FIX: Proper DR Image backup of partitions that are outside of disk bounds (“VimSDK error: bad parameters of function”)
- FIX: Primary appliance stops working after some time if secondary is on incompatible version
- FIX: Notification on Appliance UI if Catalog volume is not on SSD
- FIX: Always reconnect to VMware after reboot of appliance
- FIX: Unable to “Browse and Restore” files from cloud appliance
- FIX: Unable to perform boot of Windows machines with inconsistent NTFS
- FIX: Memory leak in JNI during backup of VMware