This is a collection of exercises, answers and explanations to help new KQL users learn how to write KQL.
By executing commands (operators, functions) that appear frequently in actual KQL usage situations from various angles and in various ways, the user is expected to learn the commands by hand.
In KQL, as in any other programming language, each language element is given a name. Remembering these names is not mandatory, but it is a good thing to keep in mind in order to improve learning efficiency. Rather than simply lumping everything together as “commands”, you can quickly gain a better understanding of how KQL works by being aware of which elements you are currently using.
Log Analytics DEMO workspace
This is a demo workspace with sample data. The KQL on this page is intended to be run against this data.
Kusto Query Language
This is the reference for KQL in Azure Data Explorer, of which Log Analytics is a partial implementation.
To search the Azure Monitor logs, you need to know what tables are in the workspace and what structure each table contains records. Here you will learn how to use basic operators to find tables that contain the information you are looking for, and how to examine the structure of the data contained in the tables you locate, using the most commonly used tables as the subject matter.
search "contoso"
Description :
The search operator performs a search on all tables in the workspace if no table is specified.
The search operator can be used when you do not know which table contains the data you want, or when you do not know if the data exists in the workspace in the first place.
It should not be used too often for queries that are executed periodically, as it has a performance disadvantage when searching large amounts of data.
It is also not available in some Kusto execution environments. For example, it can be used in Log Analytics, but not in Azure Resource Graph.
// Prefered
search "contoso" and "retail"
// NOT Prefered
// search "contoso" | search "retail"
// search "contoso" | where Name contains "retail"
Description :
The search operator can generate many results, which affects performance. If multiple conditions need to be specified, they should be included in a single search operator condition. Piping multiple search operators or filtering search operator results by where will result in a large result set being generated by the previous search operator for intermediate processing. In contrast, a single search operator with multiple conditions will generate a smaller result set.
search kind=case_sensitive "Contoso"
Description :
The search operator performs case-insensitive searches by default, but you can explicitly specify the behavior with the kind argument. case_sensitive is a case-sensitive specification.
search in (Perf) "contosohotels"
// able to search piped Table
Perf
| search "contosohotels"
Description :
The search operator can also perform searches on tables received from a pipe. The value of the “in” argument specifies the table to be searched. “search” operator can also perform searches on tables received from a pipe.
search in (SecurityEvent) "contosohotels"
// able to search piped Table
SecurityEvent
| search "contosohotels"
Description :
This example is almost identical to the one before. It is provided to help you understand that records with the same string may exist in more than one table.
search in (Alert,SecurityAlert) "contosohotels"
// Other expression
// Need to combine multiple tables
Alert
| union SecurityAlert
| search "contosohotels"
Description :
The “in” argument allows multiple tables to be specified. In this example, the search is performed across tables with potentially related information. An additional example is the “union” of multiple tables, since KQL does not allow multiple tables to be passed through the pipeline.
search *
| distinct $table
Description :
KQL usually performs searches by specifying tables, it is necessary to know what tables are in the workspace. The search results of the search operator include a $table column representing the table to which it belongs, it is possible to list all tables contained in the workspace by extracting unique values with the distinct operator.
It is very time consuming to check the existing tables from the UI, this example is provided as a simple way to check only the table names.
AzureDiagnostics
| take 10
// limit and take is same
//AzureDiagnostics
//| limit 10
Description :
AzureDiagnostics stores Azure diagnostic logs, which contain a large number of records. The limit operator works the same way as take.
take does not always return the same result. For example, the latest data may not be taken in chronological order, and the result may differ from the previous run even if the data set remains the same.
AzureDiagnostics
|distinct Category
Description :
The diagnostic logs contained in the AzureDiagnostics table consist of logs from multiple resources. Since logs from dozens of different resources are typically aggregated, it is important to understand what resources’ logs are included.
Since the Category column contains hints about the resource that generated the log and the type of log, it is easy to determine the presence or absence of the desired log by listing the Category values in the distinguishing operator.
AzureDiagnostics
|distinct ResourceType,Category
Description :
Since AzureDiagnostics’ ResourceType contains the type of the resource that generated the log, it can be combined with Category and distinct to show clearly which resource is generating which category of logs.
AzureDiagnostics
|where Category == "ApplicationGatewayAccessLog"
Description :
The where operator is a filter that selects records according to given criteria; since each type of log in the AzureDiagnostics table has a different Category column value, you can filter by Category column value to select only the desired resource and desired type of log. The following table shows the number of items that can be selected.
AzureDiagnostics
|where Category == "ApplicationGatewayAccessLog"
|sort by TimeGenerated desc
Description :
Filtered logs are not always arranged in the order expected by the query executor. In particular, they are often expected to be arranged according to chronological order, but this is often not the case.
The sort operator sorts records by the specified column values. By default, it sorts in descending order. In this example, the desc argument is specified to explicitly sort in descending order, but it can be omitted.
AzureDiagnostics
|where Category == "ApplicationGatewayAccessLog"
|sort by TimeGenerated asc
Description :
To sort in ascending order with the sort operator, specify the asc argument.
AzureDiagnostics
|where Category == "ApplicationGatewayAccessLog"
|top 10 by TimeGenerated desc
Description :
The top operator is used to sort records by the column value specified in the by argument. top, like sort, sorts records in descending order by default, and in this example, desc is specified explicitly.
AzureDiagnostics
|where Category == "ApplicationGatewayAccessLog"
|top 10 by TimeGenerated asc
Description :
The top operator, like sort, can also perform ascending sorting with an asc argument.
AzureMetrics
| take 10
Description :
The AzureMetrics table is a table that holds metrics for Azure resources. If metrics are enabled in the diagnostic settings for a resource, information is stored in this table.
Metrics are like performance counters in Windows and are used to measure numerical aspects of a resource.
AzureMetrics
| distinct MetricName
Description :
The AzureMetrics table receives metrics from various resources. Each metric has a field called MetricName that identifies which metric a given record is data for.
AzureMetrics
| distinct ResourceProvider,MetricName
Description :
Since there are quite a few different types of metrics, it is helpful to use the information of the resource provider from which they are generated when listing them.
Viewing unique combinations of ResourceProvider and MetricName in the distinct operator allows you to organize the metrics generated by each resource provider.
AzureMetrics
| where MetricName == "Percentage CPU"
Description :
To display a specific metric, specify any MetricName in the where operator.
AzureMetrics
| where Resource == "CH1-SQLVM12" and MetricName == "Percentage CPU"
Description :
The where operator can filter records by multiple criteria. Resource name and metric can be used to display only any metric for a specific resource.
AzureActivity
| take 10
Description :
This example may not work in the demo workspace, so please try it in your Log Analytics workspace.
[Ref:Send to Log Analytics workspace] Send to Log Analytics workspace (https://docs.microsoft.com/azure/azure-monitor/essentials/activity-log#send-to-log-analytics-) workspace)
The AzureActivity table records operations on subscriptions, such as changing resources or starting virtual machines.
You can check the change work and see if any changes have been made to unnecessary resources.
AzureActivity
| distinct ResourceProviderValue,_ResourceId, OperationNameValue
Description :
_ResourceId is the unique identifier of the resource and ResourceProviderValue represents the resource provider. This query allows you to see what operations have been performed on each resource, organized by provider.
AzureActivity
| distinct Caller,CallerIpAddress
Description :
Caller is the user who performed the execution, and CallerIpAddress is the IP address of the source of the operation. This can be used to check for operations from unintended users or locations.
AzureActivity
| where CategoryValue == "Administrative"
| distinct Caller
Description :
Administrative work includes the value Administrative in CategoryValue.
AzureActivity
| where ActivityStatusValue == "Failure" and CategoryValue == "Administrative"
| distinct Caller,CallerIpAddress, _ResourceId
Description :
An entry whose operation failed will have an ActivityStatusValue of Failure. This query allows you to view users who have attempted and failed administrative tasks.
Reasons for failure vary, but for example, an unusual CallerIpAddress may indicate a security breach.
AzureDiagnostics
| where ResourceType == "SERVERS/DATABASES"
| parse ResourceId with * "/RESOURCEGROUPS/" ResourceGroupName "/" * "/DATABASES/" Databasename
| project TimeGenerated, Category, OperationName, ResourceGroupName, Databasename
Description :
Use the parse
operator to extract the resource group and resource name from the resource id field
Syslog
| where Facility == "authpriv"
| parse SyslogMessage with * "(" session "):" * "user" UserName
Description :
Use the parse
command to extract a string value consistently from a log string entry
VMComputer
| summarize arg_max(TimeGenerated, *) by Computer
| extend IPAddress = tostring(Ipv4Addresses[0])
| extend DNSNAme = tostring(DnsNames[0])
| project Computer, BootTime, IPAddress, DNSNAme, OperatingSystemFullName, PhysicalMemoryMB, Cpus, CpuSpeed, AzureResourceGroup, AzureSubscriptionId
Description :
Use the project
operator to return very specific columns from a query
SecurityEvent
| summarize count() by EventID, Activity
| project-away EventID
Description :
Use the project-away
operator to exclude a specific column from the query result
Azure Monitor logs have a standard defined column, the TimeGenerated
column, which indicates the time generated by the data source, allowing you to see the time of the log record. Using the TimeGenerated
column, you can filter based on time and the number of records.
print now()
Description :
The current time can be obtained using the now()
function. The return value of any date or time function, including the now()
function, is of type datetime
and is expressed in UTC time in Kusto.
print datetime("2022-01-01 12:00")
Description :
The datetime
function allows you to retrieve a date or time by specifying a string. The return value of the datetime
function can be obtained as type datetime
.
print datetime("2022-01-01 12:00") + 9h
Description :
The datetime
type returned by the datetime
function is treated in UTC time. To change UTC time to another time zone, add (or subtract) the time to the value of type datetime
. To add to or subtract from a datetime
type, use a value of type timespan
. Use notation such as 9h
(9 hours) or 30m
(30 minutes), as in this query.
print datetime("now") + 9h - 30m
Description :
You can specify a time in the past or future by adding or subtracting against the datetime
type. Addition to or subtraction from a datetime
type can use values of type timespan
.
Alert
| where TimeGenerated between (datetime("2022-01-01 00:00:00")..1day)
Description :
When checking logs output by Azure services for troubleshooting, etc., you may want to check logs for a specific day. You can use the between
and datetime
functions in combination for a TimeGenerated
column to limit the time range.
Alert
| where TimeGenerated between (datetime("2022-01-01 00:00:00")..datetime("2022-01-02 23:59:59"))
Description :
The datetime
function can be used at the beginning and end of the range of the between
function to make the range include a specific date and time.
Alert
| where TimeGenerated between (datetime("2022-01-01 12:00")..datetime("2022-01-02 12:59"))
Description :
At the beginning and end of the range of the between
function, the datetime
function can be used to specify a range of dates and times for the datetime
function. The datetime
function accepts a variety of date expressions as arguments.
Alert
| where TimeGenerated > startofday(now())
Description :
Comparison operators can be used on the TimeGenerated
type to represent different ranges of time. The startofday
function returns a value of type datetime
representing the start of the date given as argument. That is, “2022-01-01 12:34” represents “2022-01-01 0:00:00.000”.
Alert
| where TimeGenerated > startofweek(now())
Description :
Alert
| where TimeGenerated > ago(3day)
Description :
The ago
function returns a value of type datetime
representing the specified date and time before the current time. That is, ago(3d)
represents the time three days before the current time. In addition to dates, you can specify other time units such as 1m
or 1h
!
Alert
| where TimeGenerated > ago(30m)
Description :
Alert
| where TimeGenerated > ago(1.5h)
Description :
Alert
| where TimeGenerated between (now(-3d)..1day)
Description :
The now
function accepts an offset of type timespan
as an argument. When combined with the between
function, various time ranges can be represented. This query produces the same result as now(-3d)
as ago(3d)
.
Alert
| where TimeGenerated > startofday(now()) - 9h
Description :
The TimeGenerated
and startofday
functions are expressed in UTC time. Therefore, time zone must be taken into account when filtering by local time.
Alert
| where TimeGenerated > ago(30m)
| extend timeAgo = now() - TimeGenerated
Description :
In a table containing access logs, etc., you may want to check the difference between the time of the record and the current time. Using extend
and the time function, you can add a new column and output the time difference.
AzureDiagnostics
| where TimeGenerated > startofweek(now())
| summarize count() by startofday(TimeGenerated)
Description :
The summarize
function is often combined with the time function. It is useful when aggregating access logs, for example.
AzureDiagnostics
| where TimeGenerated >= ago(30m)
| summarize count() by bin(TimeGenerated, 1m)
Description :
Combining the summarize
function with the bin
function and the time function allows more flexible aggregation for dates. The bin
function is an alias for the floor
function.
Heartbeat
| where TimeGenerated > ago (5m)
| project Computer, TimeGenerated
Description :
This query can be used to select one or more columns to show exclusively in the results (Computer and TimeGenerated).
Heartbeat
| where TimeGenerated > ago (5m)
| project VM=Computer, TimeStamp=TimeGenerated
Description :
This query can be used to select one or more columns to show exclusively in the results and rename the column(s)
Heartbeat
| where TimeGenerated > ago (5m)
| project ['Virtual Machine']=Computer, ['Time Stamp']=TimeGenerated
Description :
This query can be used to select one or more columns to show exclusively in the results and rename the column(s)
Heartbeat
| where TimeGenerated > ago (5m)
| project VM=toupper(Computer), TimeStamp=datetime_add("Hour",5,TimeGenerated)
// simplified expression
Heartbeat
| where TimeGenerated > ago (5m)
| project VM=toupper(Computer), TimeStamp= TimeGenerated + 5h
Description :
This query can be used to created columns based on expressions.
Heartbeat
| where TimeGenerated > ago (5m)
| project-away VMUUID, MG, ManagementGroupName, SourceComputerId
Description :
This query can be used to exclude certain columns from the queries
Heartbeat
| where TimeGenerated > ago (5m)
| project-away Resource*
Description :
This query can be used to exclude certain columns from the queries and exclude columns that match the pattern
Heartbeat
| where TimeGenerated > ago (5m)
| project-keep Resource*
Description :
This query can be used to select columns to be shown in the output. Can’t be used to create expression based columns. See project
for that purpose.
Heartbeat
| where TimeGenerated > ago (5m)
| project-reorder * desc
Description :
This query example shows how to reorder (all) columns in descendant fashion
Heartbeat
| where TimeGenerated > ago (5m)
| project-rename VM=Compute
Description :
This query shows how to rename one or more columns in the results. Project-rename won’t affect filtering over columns.
Heartbeat
| extend
Age = now() - TimeGenerated
Description :
This query can be used to create a new column based on an expression
Heartbeat
| summarize LastReported=now()-max(TimeGenerated) by Computer,TimeGenerated
| extend RecentHeartBeat=iff(TimeGenerated > ago(5m),'✔️ ', '❌ ')
Description :
This query can be used to create a new column based on a condition and display a visual clue.
InsightsMetrics
| summarize count() by Computer, Namespace, Name
Description :
Count the number of performance counters collected per computer
InsightsMetrics
| summarize dcount(Name) by Computer
Description :
The dcount()
function is an aggregate function that calculates the number of unique values in a given column. In this example it is the name of the performance counters collected from the computer, but it can be used for a variety of aggregates, such as the number of locations where a user is logged in, the number of IPs to which the computer has ever connected, etc.
InsightsMetrics
| where Name == "UtilizationPercentage"
| summarize avg(Val) by Computer
Description :
Changing the aggregation function given to the summarize
operator changes the aggregation method. By giving the target column name to the avg()
function, you can compute the average of the column values.
InsightsMetrics
| where Name == "UtilizationPercentage"
| summarize max(Val) by Computer
Description :
You can compute the maximum value of a column value by giving the target column name to the max()
function.
InsightsMetrics
| where Name == "UtilizationPercentage"
| summarize min(Val) by Computer
Description :
You can compute the maximum value of a column value by giving the target column name to the min()
function.
InsightsMetrics
| where Name == "UtilizationPercentage"
| summarize avg(Val) by Computer, bin(TimeGenerated, 1h)
Description :
Aggregation with the summarize
operator can be performed on multiple columns. This is a very common pattern, and allows you to aggregate the behavior of a particular resource over time by summing over the resource and time of interest.
If you simply use TimeGenerated for aggregation, the units will be too fine, so it is common to round up using the bin()
function. The example rounds to the nearest hour, but you can also round to the nearest 10 minutes (10m), 100 seconds (100s), etc.
InsightsMetrics
| where Namespace == "LogicalDisk" and Name == "FreeSpacePercentage"
| extend DiskInstance = tostring(parse_json(Tags).["vm.azm.ms/mountId"])
| summarize FreeDiskSpace = avg(Val) by Computer, DiskInstance
| order by FreeDiskSpace asc
| take 10
Description :
The name of the performance counter for free disk space is FreeSpacePercentage
. This example selects the 10 disks with the lowest average FreeSpacePercentage
for the aggregated time period.
In the InsightsMetrics table, volume labels do not appear directly in the columns as values, but are included in the Tags column as JSON data.
For this reason, the parse_json
function is used to parse the Tags column and set the label values of the retrieved volume as new column values with the extend
operator.
let threshold = 20;
InsightsMetrics
| where Namespace == "LogicalDisk" and Name == "FreeSpacePercentage"
| extend DiskInstance = tostring(parse_json(Tags).["vm.azm.ms/mountId"])
| summarize FreeDiskSpace = avg(Val) by Computer, DiskInstance
| where FreeDiskSpace < threshold
Description :
Displays disks whose free space is below a certain value. In this example, the threshold value is stored in a variable named threshold.
Variables can be used to make the query easy to modify, easy to reuse, and easy to understand.
Event
| summarize count() by Computer
Description :
Count the number of events collected per computer
Event
| summarize count() by bin(TimeGenerated, 1h), Computer
Description :
Count the number of events collected per computer, aggregated into 1 hour time buckets
Event
| summarize count() by Computer
| top 10 by Computer
Description :
Count the number of events collected per computer, and then display the top 10 computers collecting events
Perf
| where ObjectName == "Process" and CounterName == "% Processor Time"
| summarize AverageValue = avg(CounterValue) by InstanceName
| top 10 by InstanceName
| order by AverageValue
Description :
Summarize the % Processor
Utilisation for processes across all servers, and list the top 10
SecurityEvent
| summarize EventCount = count() by EventID
| top 10 by EventCount
| render columnchart
Description :
Use the render operator to draw a chart of the top 10 Security Events
SecurityEvent
| summarize EventCount = count() by bin(TimeGenerated, 1h), Computer
| render timechart
Description :
Use the render operator to draw a chart of the security event distribution over time, per computer
InsightsMetrics
| where Namespace == "Processor" and Name == "UtilizationPercentage"
| summarize AverageValue = avg(Val) by bin(TimeGenerated, 1h), Computer
| render timechart with (ycolumns = Computer, series= AverageValue)
Description :
Use the render operator to draw a chart of the processor utilisation over time with aggregations into 1 hour averages, broken down by computer
InsightsMetrics
| where Namespace == "Memory" and Name == "AvailableMB"
| extend memorySizeMB = toint(parse_json(Tags).["vm.azm.ms/memorySizeMB"])
| extend PercentageUsed = 100-(Val/memorySizeMB)*100
| summarize AverageValue = avg(PercentageUsed) by bin(TimeGenerated, 30m), Computer
| render timechart
Description :
Use the render operator to draw a chart of the percentage used memory over time with aggregations into 30 minute averages, broken down by computer. The memory used percentage is a calculated column, as only available memory in MB and total memory in MB is presented
AzureDiagnostics
| summarize count() by ResourceProvider
| render piechart
Description :
Use the render operator to draw a chart of the different sources collected into Azure Diagnostics
Update
| where UpdateState == "Needed" and OSType!="Linux"
| summarize arg_max(TimeGenerated, *) by Title
| where (Classification == "Security Updates" or Classification == "Critical Updates" or Classification == "Definition Updates" or Classification == "Others")
| summarize MissingUpdates = count() by Classification
| order by MissingUpdates desc
| render piechart
Description :
Use the render operator to draw a chart of the missing update count in Update Management
AzureMetrics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where MetricName == "RunLatency"
| summarize AverageLatency = avg(Average) by bin(TimeGenerated, 10m), Resource
| render areachart
Description :
Use the render operator to draw a chart of run latency for logic apps
VMComputer
| project Computer, OperatingSystemFullName, _ResourceId
| join (
InsightsMetrics
| where Name == "Heartbeat"
| summarize LastHeartbeat = arg_max(TimeGenerated, *) by _ResourceId
| project LastHeartbeat, _ResourceId
) on _ResourceId
| project-away _ResourceId1
Description :
Use the join operator to join two tables to see last heartbeat per computer
SecurityEvent
| where EventID == 4624 // sign-in events
| project Computer, Account, TargetLogonId, LogonTime=TimeGenerated
| join kind= inner (
SecurityEvent
| where EventID == 4634 // sign-out events
| project TargetLogonId, LogoffTime=TimeGenerated
) on TargetLogonId
| extend Duration = LogoffTime-LogonTime
| project-away TargetLogonId1
| top 10 by Duration desc
Description :
Use the join operator to join two queries from the SecurityEvents table together to determine logon duration
Update
| where OSType!="Linux" and Optional==false
| summarize hint.strategy=partitioned arg_max(TimeGenerated, *) by Computer,SourceComputerId,UpdateID
| where UpdateState=~"Needed" and Approved!=false
| join (UpdateSummary
| summarize arg_max(TimeGenerated, *) by Computer, OsVersion, RestartPending
| project Computer, OsVersion, RestartPending
)
on Computer
| join ( Heartbeat
| summarize LastHeartBeat = arg_max(TimeGenerated, *) by Computer
| project LastHeartBeat, Computer
) on Computer
| summarize arg_max(TimeGenerated, *), SecurityUpdates = countif(Classification == "Security Updates"), CriticalUpdates = countif(Classification == "Critical Updates"), DefinitionUpdates = countif(Classification == "Definition Updates"), ServicePacks = countif(Classification == "Service Packs"), Others = countif(Classification != "Security Updates" and Classification != "Critical Updates" and Classification != "Definition Updates" and Classification != "Service Packs") by Computer, OsVersion, RestartPending, ResourceId, LastHeartBeat
| project Computer, ResourceId, OsVersion, RestartPending, LastHeartBeat, SecurityUpdates, CriticalUpdates, DefinitionUpdates, ServicePacks, Others
Description :
Use the join operator to join Heartbeat table to Update Management table to view update status, last heartbeat and reboot status
Heartbeat
| summarize arg_max(TimeGenerated, *) by Computer, _ResourceId
| join (
Syslog
| summarize SyslogCount = count() by Computer, _ResourceId
) on _ResourceId
| project _ResourceId, Computer, ComputerIP, OSName, ResourceGroup, SubscriptionId, SyslogCount
Description :
Use the join operator to join Heartbeat table to syslog table to get a count of events per computer
let serviceName= "VirtualDesktop";
externaldata(changeNumber: string, cloud: string, values: dynamic)["https://download.microsoft.com/download/7/1/D/71D86715-5596-4529-9B13-DA13A5DE5B63/ServiceTags_Public_20220228.json"] with(format='multijson')
| mv-expand values
| where values contains serviceName
| mv-expand values.properties.addressPrefixes
| project addressPrefixes= tostring( values_properties_addressPrefixes)
Description :
IP addresses used by Azure services are published as service tags and can be referenced from the Internet.
The externaldata
operator retrieves data described in CSV, JSON, etc. from an external storage account. This sample retrieves files directly from the Internet, but SAS can be used to retrieve data from storage accounts to which access is controlled.
Note - The file name ServiceTags_Public_YYYYYMMDD.json will change as the IP list is updated, so you must always refer to the new file.
Azure IP Ranges and Service Tags – Public Cloud
The files referenced by KQL should be placed in a storage account that you manage yourself, and you should consider a mechanism whereby the files in the storage account are automatically updated whenever the list is updated. Using external data sources to enrich network logs using Azure storage and KQL
print now()
print now(9h)
print ago(9h)
print substring("hello KQL world",6,3)
print 1 + 2 + 3
print iif(true, 100 , 200)
print iif(false, 100 , 200)
Description :
The print
operator performs scalar expressions (calculations that do not create a table with multiple rows). It is useful when including time or string calculations in a long query, and you want to see the results of a partial calculation beforehand.