At Bastian Solutions, we continually strive to provide superb customer service at the highest standard, offering not only reactive but proactive support. Transparency is a key component in achieving this objective, and with that in mind, we are providing a full root cause analysis of the service disruption on June 4th and 5th, described below.
Summary:
Wednesday, June 4th
10:49 PM EST The site called in reporting an issue with cartons not releasing. They reported that Lines 1 and 2 were not releasing and that cartons were not making it to the print and apply.
11:05 PM EST T1 attempted to connect to the site but was unable to connect to the VPN due to the VPN account being locked out.
11:30 PM EST The site called in to report that Lines 3 and 7 were seeing the same issue.
11:35 PM EST The site reported that they were also unable to connect to the VPN and were reaching out to their IT team for assistance with the VPN.
Thursday, June 5th
5:45 AM EST The site reached back out to T1 to get more information on the VPN and the user account being used.
6:26 AM EST The site called back to say the VPN account had been reset. T1 was able to connect to the site.
6:45 AM EST T1 was connected and troubleshooting the issue. They discovered an issue with a photo-eye on Lane 2 and engaged our Controls team to take a look.
9:32 AM EST The site called back in and reported that the Print and Apply printers were not printing labels.
9:54 AM EST A bridge call was started.
10:00 AM EST T1 worked with the site to troubleshoot the AOR and get connected to the application server. Phillip noted several errors on the PandA and reached out to the Controls team to assist.
10:15 AM EST Errors were seen on the PandA, but Controls verified everything looked fine on their side. The DHL team sent in a request to get the support account unlocked so we could access the app server (the account had been disabled).
10:30 AM EST Phillip and the DHL team decided the best method to get onto the app server for the time being was through GoToMeeting, logging in under Bill's account.
10:35 AM EST After getting logged in, it was noted that we were receiving no communication in the socket logs and that we were unable to ping the AOR for Lines 1 and 2 from the server. Controls noted the same with the PLC machines for Lines 1 and 2.
10:45 AM EST T1 escalated to T2
11:00 AM EST While troubleshooting the import logs in Exacta, it was noticed that only the heartbeat message was being sent from the host system and that no order/label data had been sent since the previous night. Preston and Phillip queried the database for the waves the host system had recently sent over and found they were not in the system (see the sketch after the timeline for the type of check used). The import services were restarted.
11:05 AM EST The Bastian team advised the DHL team to loop in the host system team for the import errors and the networking team for the inability to reach the AOR for Lines 1 and 2.
12:45 PM EST T1 transitioned to T2
1:00 PM EST The site discovered an issue with a ZPL file they had sent down the previous night. The export to Exacta was canceled (by the host). At this point, all of the labels that were in the queue of pending exports were sent and imported into Exacta.
Once all of the data was sent to Exacta, the operations side attempted to print labels, but the labels failed to print. We waited on the site to verify they were following the correct workflow for printing.
2:00 PM EST T2 reached out to Development.
2:10 PM EST Dev was engaged by T2, brought up to speed, and began troubleshooting the issue. Dev asked for logs from the AOR; due to restrictions with the site's network, there was some delay in getting the logs back to Dev.
3:00 PM EST The site mentioned some password changes that would affect several service accounts. We began troubleshooting that, verifying that the Exacta services were not affected. The site was also looking at PDF printing; those print types were the same labels that we were having issues with.
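For reference, the 11:00 AM check amounted to comparing the waves the host reported sending against what was actually present in the Exacta import tables. The sketch below illustrates that kind of check in Python; the connection string, table name, and column names are illustrative placeholders only, not the actual Exacta schema.

import pyodbc

# Hypothetical connection string, table, and column names for illustration;
# the real Exacta database details are not shown here.
CONN_STR = ("DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=app-server;DATABASE=exacta;Trusted_Connection=yes")

def find_missing_waves(expected_wave_ids):
    """Return the wave IDs the host reports sending that are not in the import table."""
    conn = pyodbc.connect(CONN_STR)
    try:
        cursor = conn.cursor()
        missing = []
        for wave_id in expected_wave_ids:
            # IMPORT_WAVE / WAVE_ID are assumed names used for this sketch only.
            cursor.execute("SELECT 1 FROM IMPORT_WAVE WHERE WAVE_ID = ?", wave_id)
            if cursor.fetchone() is None:
                missing.append(wave_id)
        return missing
    finally:
        conn.close()

# Example: waves the host reported sending the previous night.
print(find_missing_waves(["W1001", "W1002", "W1003"]))

If this returns wave IDs, the host believes the data was sent but it never landed in Exacta, which matches what was observed on the bridge call.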
Root Cause:
The site reported that there is a known issue where, when bad ZPL data is sent from the host, no further label imports will be sent. Once the bad ZPL was canceled, the host system was able to resume sending data.
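As a point of reference only (this is not the site's actual validation, which we have not reviewed), a minimal structural check can catch obviously malformed ZPL before it reaches the export queue. A valid ZPL label starts with ^XA and ends with ^XZ, so a sketch of such a check might look like this:

def looks_like_valid_zpl(label_data: str) -> bool:
    """Minimal structural check: a ZPL label should start with ^XA and end with ^XZ."""
    data = label_data.strip()
    return data.startswith("^XA") and data.endswith("^XZ")

# Example: reject the label before it can block the label export queue.
bad_label = "^XA^FO50,50^ADN,36,20^FDCarton 12345"   # missing the closing ^XZ
if not looks_like_valid_zpl(bad_label):
    print("Rejecting malformed ZPL before export")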
Resolution:
Once the bad ZPL was canceled and the Exacta services were restarted, the host was able to send down new waves and reprint the labels that were currently being worked.
Next Steps and Preventive Actions:
The site mentioned they had an alert set up to report on the bad ZPL. Bastian was not able to get connected and troubleshoot the issue promptly because the VPN account was locked out. A request has been put in for multiple VPN accounts in an effort to prevent this issue from recurring. Further documentation is being put in place to assist T1 in catching this issue in a more timely manner.
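One check that could help T1 spot this condition earlier is watching the import feed for periods where only heartbeats are arriving. The sketch below is a minimal illustration of that idea; the message types, threshold, and data source are assumptions for this example, not the site's actual alert.

from datetime import datetime, timedelta

STALL_THRESHOLD = timedelta(minutes=30)  # assumed threshold for illustration

def import_feed_stalled(messages, now=None):
    """messages: list of (timestamp, message_type) tuples pulled from the import log.
    Returns True if nothing other than heartbeats has arrived within the threshold."""
    now = now or datetime.now()
    recent_data = [ts for ts, mtype in messages
                   if mtype != "HEARTBEAT" and now - ts <= STALL_THRESHOLD]
    return len(recent_data) == 0

# Example: only heartbeats in the last half hour -> raise an alert to T1.
sample = [(datetime.now() - timedelta(minutes=5), "HEARTBEAT"),
          (datetime.now() - timedelta(minutes=50), "LABEL_DATA")]
if import_feed_stalled(sample):
    print("ALERT: host is sending heartbeats only; no order/label data received recently")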
Our commitment to you:
Bastian Solutions understands the impact of the disruption that occurred and affected your organization's operations. Providing our clients with superb customer service is our primary objective, and we assure you we are taking the required preventive measures to prevent a recurrence.