Revit Server - Multiple file locks following unplanned network outage


    We had an incident where a Host server was unexpectedly taken offline. I’ve summarised the steps I followed once the service was restored as some of the files appear to have been corrupted. I’d be interested to learn what else I could have done (or what I did wrong!). Our network comprises one host and two accelerators.

    Initial Diagnosis: Revit Server was applying locks to the project files that prevented users from making changes. The locks were not visible using the Revit Server Administrator tool. Files were effectively opening in Read Only mode, and Save As new was not a viable workaround (as the application prevented it).

    Following a review of the user journals, it looked like several users were syncing their models at the time of the outage. The steps I followed were:
    1. Attempted to trip the system by applying locks to the files using the Revit Server Administrator site.
      • This failed as the Host reported locks were already in place.

    2. Attempted to release the locks by removing them manually by accessing the Host.
      • For this I accessed the server and used the Revit Server Command Line Utility to detect the locks. The tool failed to detect them, which implied that the locks were malformed or that the controlling database was corrupted. I also checked the health of the lock database, which the tool reported as healthy.

    3. Selectively and manually removed the element and permission level locks.
      • On inspecting the project folders in Windows Explorer, I found lock files present on the server, though they were undetected by the command tool. I reviewed the contents of the lock files and worked my way through removing one lock and one user at a time. This failed as the server generated a new lock when the file was opened (even for users who had never accessed the file before).

    4. Cleared the Accelerator caches.
      • In order to rule out the prospect of the locks being generated on synchronisation with the host I cleared the cache servers. I then repeated Steps 1 to 3.

    5. Attempted to create a backup copy of the files using Revit Server Administrator.
      • Copying to another location on the Host failed, as Revit Server reported locks were already in place that prevented copying of the files.
      • Copying to another Host (I created a new Host server) worked initially, but failed when an attempt was made to open the files for editing. It appears that the locks were copied across to the new Host. The copy had a new GUID, which led me to suspect that data corruption was occurring.

    6. Created a back-up copy of files using Windows Explorer.
      • The backup was not accessible through either Revit or Revit Server:
        1. The server in question is not configured to allow access to our wider network so I had no way of copying the files off for opening in Revit (well not without more work to open connections etc.). I stopped short of installing Revit on the Server.
        2. I understand the service works by “importing” the files into Revit Server from Revit. When they are “imported” they are assigned a database GUID and it is this which makes the files visible in Revit Server. As my copy was not an “import” Revit Server reported the folders but not the contents.

    7. Attempted to reassign a new GUID to the back-up copy.
      • I created a new file in Revit and added it to the Host. I then copied the files (from the Data folder but not the model.rvt file) from the backup to the Data folder of the new file. This generated the same lock issues when the new file was opened. This implied that the error resided within the actual data.

    8. Deleted the files from the Host and restored local user backups.
      • The restored files are usable and the users are now catching up on their lost work. The files have been given the same names so that existing links within the files will not need restoring. The new files have new GUIDs and the Host is not applying corrupted locks.

    Revised Diagnosis: Corrupted files on the Host server. Most likely to be a result of the server being taken offline during a sync session. Not all of the project files are affected, just those which were being synchronised around the time of the incident. The other files were unaffected, even though they were ‘checked out’ at the time.

    This incident could arise again if the Host were to go offline, say due to (another) network failure, whilst a synchronisation was taking place. As I mentioned at the beginning, I’d be interested to learn what else I could have done (or what I did wrong!).

    #2
    I appreciate your thoroughness. Unfortunately, Revit Server is not robust against server, network or power outages. Any disruption during a file synchronization or file open may corrupt the elements permissions table.

    We do nightly server backups of the entire C:\ProgramData\Autodesk\Revit Server\Projects\ folder. It takes about 30 minutes to xcopy 20+ GB of data to a separate file server. This is our safety net for server failure.


      #3
      Thanks for the feedback Markus.

      Our Host server is in near-constant use, as the projects on it span continents: it serves offices in the Far East (via accelerator), the Middle East (host) and Europe (via accelerator). I think we may resort to hourly NetApp snapshots, as some of the links are particularly shaky. However, I’m not sure whether Revit Server will tolerate being backed up on the fly. Apart from the Projects folder, do you recommend backing up the logs? They are growing at an alarming rate: we have tens of GB of log files.


        #4
        We also have offices across the globe, and have identified a one-hour window at 8:00pm PST for the Revit Server backup. We created a Windows task to run the DOS script, which calls the Revit Server tool. We have two scripts, since we keep a two-week backup available, and two corresponding DOS scripts to delete the folders after a week passes.

        Unfortunately, if someone is in the Revit Server model during the backup, it can disrupt the backup (the backup may be incomplete, or the model may be corrupted). Likewise, those users may complain that the server is "unavailable" during that time. We just remind them to save locally, and then synchronize at 9:00pm PST.

        @echo off
        Rem
        Rem Lock Server Prior to Backup
        cd C:\Program Files\Autodesk\Revit Server 2012\Tools\RevitServerCommand
        RevitServer lock -s
        Rem
        Rem
        Rem
        Rem
        Rem xxxxx - XCopy all data
        cd c:\windows
        XCOPY "C:\ProgramData\Autodesk\Revit Server\Projects" "\\%Fileserver%\%computername%\Week1\%date:/=%\Projects" /D/E/C/I/Q/H/K/Y
        Rem
        Rem
        Rem Unlock Folder
        cd C:\Program Files\Autodesk\Revit Server 2012\Tools\RevitServerCommand
        RevitServer unlock -s
        The only way to not allow people on the Revit Server at all is to disable IIS, but that will just create more problems than it is worth.
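One weakness of any lock, copy, unlock sequence is what happens if the copy step dies halfway. Below is a minimal Python sketch of the same sequence as the script above, written so that the unlock always runs. It is an illustration under assumptions, not a tested replacement: the tool folder and the `lock -s` / `unlock -s` arguments are taken from the script above, the `RevitServer.exe` file name is assumed, and the function names (`run_revit_server`, `copy_tree`, `backup_projects`) are made up for this example.

```python
import shutil
import subprocess

# Folder taken from the script above; adjust for your Revit Server version.
REVIT_SERVER_TOOL_DIR = r"C:\Program Files\Autodesk\Revit Server 2012\Tools\RevitServerCommand"
# Assumed executable name inside that folder.
REVIT_SERVER_EXE = REVIT_SERVER_TOOL_DIR + r"\RevitServer.exe"


def run_revit_server(action):
    """Run 'RevitServer <action> -s', where action is 'lock' or 'unlock'."""
    subprocess.run([REVIT_SERVER_EXE, action, "-s"], cwd=REVIT_SERVER_TOOL_DIR, check=True)


def copy_tree(source, destination):
    """Copy the whole folder tree, reusing the destination if it already exists."""
    shutil.copytree(source, destination, dirs_exist_ok=True)


def backup_projects(source, destination, run_tool=run_revit_server, copy=copy_tree):
    """Lock the server, copy the Projects folder, and always unlock afterwards."""
    run_tool("lock")
    try:
        copy(source, destination)
    finally:
        # Runs even if the copy raises, so the server is not left locked.
        run_tool("unlock")
```

The `run_tool` and `copy` parameters exist only so the sequence can be exercised without a real Revit Server, for example by passing a function that just records the calls.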

        On a separate note, there is also a posting on how to limit the size of the Revit Server log files. I found it useful, since these logs get unmanageably large. AUGI posting
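The cleanup scripts mentioned at the top of this post are not shown. As a rough idea of what that step involves, here is a Python sketch that deletes backup folders older than a given number of days. It is only an illustration: the function name `prune_old_backups` and its parameters are hypothetical, and it judges age by folder modification time rather than by the date in the folder name, which is an assumption that may not match the actual scripts.

```python
import shutil
import time
from pathlib import Path


def prune_old_backups(week_root, max_age_days=7, now=None):
    """Delete folders directly under week_root that are older than max_age_days.

    Returns the names of the folders that were removed, sorted alphabetically.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    # Take a full listing first so deleting folders cannot disturb the iteration.
    for folder in sorted(Path(week_root).iterdir()):
        # Age is judged by modification time, not by parsing the folder name,
        # because the %date% text in the name depends on the server's locale.
        if folder.is_dir() and folder.stat().st_mtime < cutoff:
            shutil.rmtree(folder)
            removed.append(folder.name)
    return removed
```

Passing `now` explicitly makes it easy to dry-run the logic against a test folder before pointing it at real backups.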
        Last edited by Markus; March 27, 2013, 08:24 PM.


          #5
          For future outages, you should only have to clear the contents of the users_temp folder for the users who were in flight when the outage occurred. That location is the staging area for borrowing transactions.


            #6
            Markus – Thanks for the script and the link to AUGI. Most useful. I’ll look at adapting it for our environment.
            Jason – In a prior incident, I did delete the contents of the User_Temp folder, but it had no effect. If it happens again I will try that first. Thanks.


              #7
              Looking at a bit of an old thread here but figured I would share what I do for a backup.

              Our Host is a VM and it has three drives, all in SAN storage.
              C:/ OS Drive
              D:/ Data Drive
              E:/ Backup_Archive Drive

              What I do is create a nightly backup using Robocopy. It's stupid fast compared to an Xcopy and I have not had issues with data corruption. If the server is locked then the data can't be edited in any way.
              Anyhoo, there is a Windows task that kicks off every night at midnight. It executes a .cmd file that I wrote, shown below.
              ******************
              rem This locks the model data for 2012
              cd "C:\Program Files\Autodesk\Revit Server 2012\Tools\RevitServerCommand\"
              revitserver.exe lock -server


              rem This calls robocopy to mirror the data to the Backup drive. It also writes a log file to a common location
              robocopy.exe "D:\ProgramData\Autodesk\Revit Server\Projects" "E:\Backup\Projects" /ZB /mir /copyall /R:20 /W:10 /MT:4 /NP /LOG+:C:\Users\Public\Documents\automation\logs\backup2012.txt

              rem This UN-locks the model data for 2012
              cd "C:\Program Files\Autodesk\Revit Server 2012\Tools\RevitServerCommand\"
              revitserver.exe unlock -server


              rem This does it all again for the 2013 stuff
              cd "C:\Program Files\Autodesk\Revit Server 2013\Tools\RevitServerCommand\"
              revitserver.exe lock -server

              robocopy.exe "D:\ProgramData\Autodesk\Revit Server\Projects2013" "E:\Backup\Projects2013" /ZB /mir /copyall /R:20 /W:10 /MT:4 /NP /LOG+:C:\Users\Public\Documents\automation\logs\backup2013.txt
              cd "C:\Program Files\Autodesk\Revit Server 2013\Tools\RevitServerCommand\"
              revitserver.exe unlock -server


              It takes just over two minutes to copy the 2012 project data (40GB) and just under one minute to copy the 2013 project data (10GB). When the Windows task is done, it also sends me an email to say that it succeeded. Anything that is older than 24 hours gets overwritten every night, so there have been occasions where we needed to recover a file from tape backup.

              If we need to restore a model from the previous day I will try to get everyone out of the model, lock the model, manually delete the model folder from the server while logged onto the server and then copy/paste the model folder from the backup drive. Of course we make them create new Local Copies as well. Fortunately we have only had to do this a few times.
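For anyone who wants to script the delete-and-copy part of that restore, here is a minimal Python sketch. It covers only the folder swap; getting users out and locking the model still has to happen first, and is not shown. The function name `restore_model_folder` and its parameters are hypothetical, and the example paths in the usage note are illustrative only.

```python
import shutil
from pathlib import Path


def restore_model_folder(backup_projects, live_projects, model_relative_path):
    """Replace a live model folder with the copy from the backup drive.

    Assumes everyone is out of the model and it has already been locked.
    """
    source = Path(backup_projects) / model_relative_path
    target = Path(live_projects) / model_relative_path
    if not source.is_dir():
        # Check first, so a typo in the path cannot delete the live model
        # when there is nothing to restore in its place.
        raise FileNotFoundError(f"No backup found at {source}")
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(source, target)
    return target
```

With the drive layout described above, a call might look like `restore_model_folder(r"E:\Backup\Projects", r"D:\ProgramData\Autodesk\Revit Server\Projects", r"SomeProject\SomeModel.rvt")`, where the project and model names are placeholders.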

              When I need to archive a model or a folder I just manually copy the relevant data from the E:\Backup folder to the E:\Archive folder and append the filename with the date. This isn't important enough for me to come up with an automated way to do it.

              What is really the Elephant in the Room is what to do when someone wants to 'Go back in Time' and see a model as it existed at a certain point in time but not mess up the current model (and all of its linked models). I think this would be possible if we renamed the top level folder of the LIVE model data, restored the OLD model data to the Projects folder and then did some creative re-renaming. Has anyone run into this?

              Here's the .xml file of the Windows Task. Sorry, it doesn't really paste in very well.
              *********************
              <?xml version="1.0" encoding="UTF-16"?>
              <Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
              <RegistrationInfo>
              <Date>2012-05-01T00:01:27.9271971</Date>
              <Author>INTRANET\jbailly</Author>
              <Description>This task will perform a nightly backup of all data in the Projects folder of Revit Server. First it uses the RevitServerCommand to place a top level lock on the Central Server. Then it uses Robocopy to copy the data to a separate drive. Then it releases the top level lock.</Description>
              </RegistrationInfo>
              <Triggers>
              <CalendarTrigger>
              <StartBoundary>2012-05-01T00:00:00</StartBoundary>
              <Enabled>true</Enabled>
              <ScheduleByDay>
              <DaysInterval>1</DaysInterval>
              </ScheduleByDay>
              </CalendarTrigger>
              </Triggers>
              <Principals>
              <Principal id="Author">
              <UserId>INTRANET\jbailly</UserId>
              <LogonType>S4U</LogonType>
              <RunLevel>HighestAvailable</RunLevel>
              </Principal>
              </Principals>
              <Settings>
              <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
              <DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries>
              <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>
              <AllowHardTerminate>true</AllowHardTerminate>
              <StartWhenAvailable>false</StartWhenAvailable>
              <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>
              <IdleSettings>
              <StopOnIdleEnd>true</StopOnIdleEnd>
              <RestartOnIdle>false</RestartOnIdle>
              </IdleSettings>
              <AllowStartOnDemand>true</AllowStartOnDemand>
              <Enabled>true</Enabled>
              <Hidden>false</Hidden>
              <RunOnlyIfIdle>false</RunOnlyIfIdle>
              <WakeToRun>false</WakeToRun>
              <ExecutionTimeLimit>PT4H</ExecutionTimeLimit>
              <Priority>7</Priority>
              <RestartOnFailure>
              <Interval>PT15M</Interval>
              <Count>3</Count>
              </RestartOnFailure>
              </Settings>
              <Actions Context="Author">
              <Exec>
              <Command>C:\Users\Public\Documents\automation\backupProjects.cmd</Command>
              </Exec>
              <SendEmail>
              <Server>YOUR SMTP SERVER NAME HERE</Server>
              <Subject>Nightly backup of Revit Server complete</Subject>
              <To>[email protected]</To>
              <From>[email protected]</From>
              <Body></Body>
              <HeaderFields />
              <Attachments />
              </SendEmail>
              </Actions>
              </Task>


              Towards the end you can see where I have it send me an email when it succeeds. I haven't tried to figure out how to get it to send me an email if it fails.
              In the past, I had it running as System instead of as my user account. It worked fine for months and then started failing. That's when I changed it to run as my user account.
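On detecting failures: one starting point is robocopy's exit code, which is a set of bit flags rather than a simple zero or non-zero. Values below 8 mean the run completed without copy failures, and 8 or above means something could not be copied. The Python sketch below decodes that code. It only interprets the number; how the code is captured from the .cmd file and how a failure email is sent are left open. The names `ROBOCOPY_FLAGS` and `describe_robocopy_exit` are made up for this example.

```python
# Robocopy exit codes are bit flags that can be combined (for example 3 = 1 + 2).
ROBOCOPY_FLAGS = {
    1: "files were copied",
    2: "extra files or folders were detected in the destination",
    4: "mismatched files or folders were detected",
    8: "some files or folders could not be copied",
    16: "serious error, no files were copied",
}


def describe_robocopy_exit(code):
    """Return (failed, notes) for a robocopy exit code.

    failed is True when the code is 8 or higher, which means at least
    one copy failure or a serious error.
    """
    failed = code >= 8
    notes = [text for bit, text in ROBOCOPY_FLAGS.items() if code & bit]
    return failed, notes
```

A wrapper around the nightly job could call this with the exit code from each robocopy run and only treat the night as successful when `failed` is False for both the 2012 and 2013 copies.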

              Hope that helps!


                #8
                Really useful. Thank you.


                  #9
                  For high-risk projects, I also recommend scheduling an archive of the Revit Server project to a network drive using the Revit Server tool. Use it with caution, though: it takes longer than the manual process and locks the model while it runs, so it should not be run during production time.

                  Revit Server Model Creation Command-Line Utility - WikiHelp
