Walkthrough: Citrix troubleshooting quiz 1
Introduction
I have provided a Citrix hands-on troubleshooting quiz.
Today I will show one possible solution (out of many) and how to analyze these log files and dumps.
A new Citrix quiz! https://t.co/7owQ1MaMLX And this time it's a hands-on quiz! You have the chance to practice your #ProcessMonitor #WinDbg #CDF skills. It's not super hard, so it's suitable for beginners too! Always happy to get feedback :-) #Citrix #troubleshooting #debugging
— Patrick Matula (@p_matula) June 18, 2023
First challenge: Process Monitor trace
In the quiz, we see a screenshot with the error message “The system was not configured properly.” We have received a ProcMon trace for the analysis, so let’s open it.
A few things are good to know:
- The Citrix PVS console is an MMC console.
- Two services are always important for Citrix PVS:
  - Citrix Stream Service
  - Citrix SOAP Service
- Citrix PVS uses a database.
The Citrix Stream Service is responsible for delivering the data to the target devices. In general, this has nothing to do with the console, so we can rule out the Stream Service.
On the process side, mmc.exe and SoapServer.exe are therefore the interesting ones.
Now I would scroll up from the bottom (because we assume that the trace ended after the error) and see if anything jumps out at me.
But I don’t see anything right away.
Because we know that PVS keeps its data in the database, we can check whether the database is reached at all. The easiest way is to check if there is communication with the SQL server. We don’t know the name of the SQL server or its IP address, but we know that the port is usually 1433. If we filter for port 1433, we don’t see any activity. This is interesting.
As you can see, no lines.
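The same check can be scripted against a CSV export of the trace. The following is a minimal sketch, assuming the usual ProcMon CSV column names ("Process Name", "Operation", "Path", "Result"). The sample rows, host names and PIDs are invented for illustration and are not taken from the quiz trace. It also assumes that ProcMon may display the port as the service name ms-sql-s instead of 1433 when it resolves network addresses.

```python
import csv
import io

# Invented rows in the column layout of a ProcMon CSV export.
# Host names, PIDs and timestamps are placeholders, not data from the quiz trace.
SAMPLE_CSV = '''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:15:01.0000001","SoapServer.exe","4242","TCP Connect","pvs01.example:50123 -> sql01.example:1433","SUCCESS","Length: 0"
"10:15:01.0000002","mmc.exe","5150","TCP Send","pvs01.example:50124 -> pvs01.example:54321","SUCCESS","Length: 120"
"10:15:01.0000003","svchost.exe","900","TCP Connect","pvs01.example:50125 -> sql01.example:ms-sql-s","SUCCESS","Length: 0"
'''

INTERESTING = {"mmc.exe", "soapserver.exe"}
# Assumption: the port may appear as a number or as its service name.
SQL_PORTS = {"1433", "ms-sql-s"}


def sql_traffic(rows):
    """Return network events of the interesting processes that touch a SQL port."""
    hits = []
    for row in rows:
        if row["Process Name"].lower() not in INTERESTING:
            continue
        if not row["Operation"].startswith(("TCP ", "UDP ")):
            continue
        # Network paths look like "local:port -> remote:port".
        endpoints = row["Path"].split(" -> ")
        ports = {endpoint.rpartition(":")[2] for endpoint in endpoints}
        if ports & SQL_PORTS:
            hits.append(row)
    return hits


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = sql_traffic(rows)
print(len(hits), "matching event(s)")
```

For a real export, the rows would come from the exported file instead of the inline sample (ProcMon writes a byte order mark, so opening the file with encoding="utf-8-sig" is a reasonable choice). An empty result corresponds to the empty filter view described above.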
Now we could google how the Citrix SOAP Service knows which database to use. But let’s try it without that.
It’s important to re-enable file system and registry activity in ProcMon. After that, we can simply filter the path on a good guess such as “SQL” (which returns nothing) or “Database”.
And “Database” results in two lines:
The result is NAME NOT FOUND and, as the name suggests, it simply means that the (registry) entry could not be found. If we now compare this with a working PVS server (or ask Google), we see that this registry key contains the SQL connection data.
So that seems to be the problem.
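The "good guess" filter can also be expressed as a few lines of code over a CSV export. This is only a sketch under assumptions: the column names follow the usual ProcMon CSV export, and the sample rows (timestamps, PIDs, the second and third row) are invented for illustration. Only the registry path and the NAME NOT FOUND result come from the walkthrough.

```python
import csv
import io

# Invented rows in ProcMon's CSV layout; only the Database key path and its
# NAME NOT FOUND result mirror what the walkthrough describes.
SAMPLE_CSV = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:15:02.0000001","SoapServer.exe","4242","RegOpenKey","HKLM\Software\Citrix\ProvisioningServices\Database","NAME NOT FOUND","Desired Access: Read"
"10:15:02.0000002","SoapServer.exe","4242","RegOpenKey","HKLM\Software\Citrix\ProvisioningServices","SUCCESS","Desired Access: Read"
"10:15:02.0000003","mmc.exe","5150","CreateFile","C:\Windows\System32\mmc.exe","SUCCESS",""
'''


def failed_lookups(rows, needle, result="NAME NOT FOUND"):
    """Return events whose path contains `needle` (case-insensitive) and that ended with `result`."""
    needle = needle.lower()
    return [
        row for row in rows
        if row["Result"] == result and needle in row["Path"].lower()
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in failed_lookups(rows, "database"):
    print(row["Process Name"], row["Operation"], row["Path"])
```

Matching on the result as well as the path keeps the output short, because successful lookups of similarly named keys are dropped.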
So we could answer the questions of the quiz as follows:
3.: SoapServer.exe
4.: registry
5.: HKLM\Software\Citrix\ProvisioningServices\Database
Second challenge: CDF trace
Okay, the second question is a CDF trace from a Director. We want to find out which SQL statement was executed, so we open the CDF trace in CDFControl.
First of all, we filter the module to DirectorService.
We are now down to 30 lines from 138.
A good start. Now we filter the Class to Information. EntryExit has only entries like “[t:52, s:unknown]GetConnector returning ...” or “[t:52, s:unknown]GetConnector called. connectionAddress = 'http://ctxdcsf1.home.test/Citrix/Monitor/OData/v3'”.
So that’s not too interesting.
We reduced our 30 lines to 16.
We can take a look at the long messages and see these two lines:
[t:52, s:unknown]dynamicQueryInput
Filter : LogOnStartDate >= DateTime(638226074143346764) and Session.Machine.DesktopGroup.Name.Contains("ABC")
Select : New(Id as Id, (Session != null ? Guid?(Session.SessionKey) : null)as SessionKey, (Session != null ? (Session.Machine != null ? String(Session.Machine.Sid) : null) : null)as Sid, (Session != null ? (Session.User != null ? String(Session.User.UserName) : null) : null) as AssociatedUserName, (ConnectionFailureLog != null ? Int32?(ConnectionFailureLog.ConnectionFailureEnumValue) : null) as ConnectionFailureReason, (ConnectionFailureLog != null ? DateTime?(ConnectionFailureLog.FailureDate) : null) as ConnectionFailureTime, LogOnStartDate as ConnectionStartTime, ClientAddress as ClientAddress, ClientVersion as ClientVersion, (Session != null ? (Session.Machine != null ? String(Session.Machine.Name) : null) : null) as MachineName, (Session != null ? (Session.Machine != null ? String(Session.Machine.AgentVersion) : null) : null) as AgentVersion, (Session != null ? (Session.Machine != null ? (Session.Machine.DesktopGroup != null ? String(Session.Machine.DesktopGroup.Name) : null) : null) : null) as DeliveryGroupName)
Orderby : Session.User.UserName asc
Skip : 0
Take : 50
----
[t:52, s:unknown]Odata query executed : http://ctxdcsf1.home.test/Citrix/Monitor/OData/v3/data/Connections()?$filter=LogOnStartDate ge datetime'2023-06-17T14:03:34.3346764' and substringof('ABC',Session/Machine/DesktopGroup/Name)&$orderby=Session/User/UserName&$skip=0&$top=50&$expand=Session,Session/Machine,Session/User,ConnectionFailureLog,Session/Machine/DesktopGroup&$select=Id,Session/SessionKey,Session/Machine/Sid,Session/User/UserName,ConnectionFailureLog/ConnectionFailureEnumValue,ConnectionFailureLog/FailureDate,LogOnStartDate,ClientAddress,ClientVersion,Session/Machine/Name,Session/Machine/AgentVersion,Session/Machine/DesktopGroup/Name
Even if you don’t know SQL and you just look at the rows, the following jumps out at you:
Session.Machine.DesktopGroup.Name.Contains("ABC")
DeliveryGroup is internally referred to as DesktopGroup? Sounds plausible. I would therefore assume from the trace that the delivery group the colleague was looking for is called “ABC”.
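Both trace lines describe the same query, and that can be cross-checked with a short script. The sketch below is an illustration, not part of the quiz: it splits a shortened copy of the OData URL from the trace into its query options, pulls out the substringof term, and converts the DateTime(...) value from the dynamicQueryInput line, assuming it is a .NET tick count (100 ns units since 0001-01-01). The helper names are made up for this example.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import parse_qs

# Shortened copy of the URL from the trace ($expand and $select omitted).
ODATA_URL = (
    "http://ctxdcsf1.home.test/Citrix/Monitor/OData/v3/data/Connections()"
    "?$filter=LogOnStartDate ge datetime'2023-06-17T14:03:34.3346764' "
    "and substringof('ABC',Session/Machine/DesktopGroup/Name)"
    "&$orderby=Session/User/UserName&$skip=0&$top=50"
)
# Value inside DateTime(...) in the dynamicQueryInput line.
TICKS = 638226074143346764


def odata_options(url):
    """Split the query string into a dict such as {'$filter': ..., '$top': ...}."""
    query = url.split("?", 1)[1]
    return {key: values[0] for key, values in parse_qs(query).items()}


def substringof_terms(filter_expr):
    """Return (search text, property path) pairs of substringof() calls."""
    return re.findall(r"substringof\('([^']*)',\s*([^)]+)\)", filter_expr)


def ticks_to_datetime(ticks):
    """Convert a tick count to a datetime, truncated to microseconds."""
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)


options = odata_options(ODATA_URL)
terms = substringof_terms(options["$filter"])
literal = re.search(r"datetime'([^']+)'", options["$filter"]).group(1)

print(terms)
print(ticks_to_datetime(TICKS).isoformat(), "vs", literal)
```

The search text and the property path come out of the URL directly, and the converted tick value matches the datetime literal in the $filter up to the last (100 ns) digit, which Python's datetime cannot represent.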
So we could answer the questions of the quiz as follows:
6.: ABC
Third challenge: Memory Crash Dump
Okay, the final challenge.
BrokerService.exe crashed and we would like to find out why. I use WinDbg. It is generally a good idea to configure the Symbol Server. This way we get meaningful names.
What never hurts is the command !analyze -v. It’s an amazing extension that takes away some of the work and gives a good overview.
An excerpt of its output:
...
COMMENT:
*** "C:\Users\administrator.HOME\Downloads\SysinternalsSuite\procdump.exe" -w -e -ma BrokerService.exe
*** Unhandled exception: E0434352.CLR
...
EXCEPTION_RECORD: (.exr -1)
ExceptionAddress: 00007ffe15cbb77c (KERNELBASE!RaiseException+0x000000000000006c)
ExceptionCode: e0434352 (CLR exception)
ExceptionFlags: 00000081
NumberParameters: 5
Parameter[0]: ffffffff80131500
Parameter[1]: 0000000000000000
Parameter[2]: 0000000000000000
Parameter[3]: 0000000000000000
Parameter[4]: 00007ffe084b0000
PROCESS_NAME: BrokerService.exe
EXCEPTION_CODE_STR: 80070002
FAULTING_THREAD: ffffffff
STACK_TEXT:
00000094`9ac3eb40 00000000`00000000 BrokerComponent!Citrix.Cds.Broker.BrokerComponent.Initialize+0x1
00000094`9ac3ebf0 00007ffd`a8e85d9d BrokerService!Citrix.Cds.CdsController.ControllerService.InitializeService+0x1fd
00000094`9ac3ec70 00007ffd`a8e84288 BrokerService!Citrix.Cds.CdsController.ControllerService.ControllerManager+0x38
STACK_COMMAND: !C:\ProgramData\Dbg\sym\SOS_AMD64_AMD64_4.8.9166.00.dll\64532EE79a3000\SOS_AMD64_AMD64_4.8.9166.00.dll.pe 0x2899fc79c38 ; ** Pseudo Context ** ManagedPseudo ** Value: ffffffff ** ; kb
...
Okay, two things to notice:
- We can see how the memory dump was recorded (it is under COMMENT:).
- Apparently we have the unhandled exception E0434352.CLR here.
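The hexadecimal values in this output are less cryptic than they look. As a rough sketch, the following snippet decodes them using the documented HRESULT bit layout (severity bit, 11-bit facility, 16-bit code). It assumes that Parameter[0] is a sign-extended HRESULT and that EXCEPTION_CODE_STR is one as well. The small name tables only cover the values seen here, and the function names are made up for this example.

```python
# Only the values that appear in this dump; not a complete table.
FACILITIES = {0x07: "FACILITY_WIN32", 0x13: "FACILITY_URT (.NET runtime)"}
KNOWN_HRESULTS = {
    0x80131500: "COR_E_EXCEPTION",
    0x80070002: "HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND)",
}


def decode_hresult(value):
    """Split an HRESULT into failure bit, facility and code."""
    hr = value & 0xFFFFFFFF  # drop the sign extension of 64-bit parameters
    return {
        "hresult": hr,
        "failure": bool(hr >> 31),
        "facility": (hr >> 16) & 0x7FF,
        "code": hr & 0xFFFF,
    }


def clr_tag(exception_code):
    """Return the three ASCII characters in the low bytes of the exception code."""
    return (exception_code & 0xFFFFFF).to_bytes(3, "big").decode("ascii")


for raw in (0xFFFFFFFF80131500, 0x80070002):
    info = decode_hresult(raw)
    print(
        f"{info['hresult']:08X}",
        FACILITIES.get(info["facility"], "unknown facility"),
        f"code=0x{info['code']:X}",
        KNOWN_HRESULTS.get(info["hresult"], ""),
    )

print("E0434352 ->", clr_tag(0xE0434352))
```

Read this way, 80070002 already hints at a missing file (Win32 error 2), which fits what the managed exception shows in the next steps.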
The good thing about the Windows Debugger, even if you are a beginner, is that you can usually find very good information if you google what it shows you. Nobody writes about “E0434352.CLR” by accident.
So, let’s google it.
My first result is an 8-minute video about this exception, from the excellent Inside Show from Microsoft.
Whether you watch the video and learn a lot or get there via WinDbg cheat sheets, sooner or later !PrintException turns out to be interesting.
The message is “Service failed due to internal error.” Okay, good to know, but in this case not super helpful. The InnerException seems interesting, though: System.IO.FileNotFoundException. So maybe a file is missing. WinDbg is so nice and even tells us that we can click on !PrintException 000002899fc79c38 to learn more.
The exception type is now System.IO.FileNotFoundException. That’s something we already knew, but the message is interesting: “Could not load file or assembly 'BrokerFiltering, Version=7.36.0.0, Culture=neutral, PublicKeyToken=a80ce61cfbf8b47a' or one of its dependencies. The system cannot find the file specified.”
If we now check our system, we will see that the file BrokerFiltering.dll is indeed missing.
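Going from the exception message to a file name is mechanical enough to script, for example when many dumps need to be checked. A small sketch, assuming the message format shown above and assuming the assembly lives in a .dll named after its simple name (it could also be an .exe). The function names are hypothetical.

```python
import re

MESSAGE = (
    "Could not load file or assembly 'BrokerFiltering, Version=7.36.0.0, "
    "Culture=neutral, PublicKeyToken=a80ce61cfbf8b47a' or one of its "
    "dependencies. The system cannot find the file specified."
)


def assembly_display_name(message):
    """Extract the quoted assembly display name from the exception message."""
    match = re.search(r"file or assembly '([^']+)'", message)
    return match.group(1) if match else None


def parse_display_name(display_name):
    """Split 'Name, Key=Value, ...' into the simple name and its attributes."""
    simple, *pairs = [part.strip() for part in display_name.split(",")]
    attributes = dict(pair.split("=", 1) for pair in pairs)
    return simple, attributes


name, attributes = parse_display_name(assembly_display_name(MESSAGE))
# Assumption: the assembly is stored as <simple name>.dll.
candidate_file = name + ".dll"
print(candidate_file, attributes["Version"])
```

The version attribute is worth keeping next to the file name, because a file that exists in a different version can produce a similar load failure.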
So we could answer the questions of the quiz as follows:
7.: FileNotFoundException
8.: BrokerFiltering.dll
Final words
The first challenge is admittedly difficult if you don’t know anything about Citrix. I think you have to invest the most time here. If you know CDFControl, the second challenge is relatively quick. And the third challenge is a classic example of a crash dump analysis (managed code).
I’m still trying to learn a few things myself, so if you have any tips for even better and faster analysis: Let me know!
Happy Troubleshooting!