Querying for items in an Array in CosmosDB

If you have spent any time looking at the documentation for Microsoft CosmosDB / DocumentDB, you will have seen a lot of examples where the data model has a property named "Tags" that is a list of strings. But you rarely see them query on anything in that Tags property. One example I saw queried on Tags[0] = "some value". I don't know how often I will need that, but it's good to know you can do it.

After looking through the SQL syntax reference, the two ways I would most likely query the Tags are a JOIN on the Tags property or the ARRAY_CONTAINS function.
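To make the two options concrete, here is a minimal sketch of both queries as parameterized query specs. The helper names, the container alias `c`, and the lowercase `tags` property name are assumptions for illustration only; adjust them to match your own documents.

```python
# Hypothetical helpers that build parameterized Cosmos DB SQL query specs.
# Assumes each document has a "tags" array of strings (the name is an assumption).

def tag_query_array_contains(tag):
    """Match documents whose tags array contains the given tag."""
    return {
        "query": "SELECT * FROM c WHERE ARRAY_CONTAINS(c.tags, @tag)",
        "parameters": [{"name": "@tag", "value": tag}],
    }


def tag_query_join(tag):
    """Express the same filter as a self-join over the tags array.

    SELECT VALUE c is used because SELECT * is not allowed once a JOIN
    introduces a second input set.
    """
    return {
        "query": "SELECT VALUE c FROM c JOIN t IN c.tags WHERE t = @tag",
        "parameters": [{"name": "@tag", "value": tag}],
    }
```

With the azure-cosmos Python SDK, a spec shaped like this should fit the `query` and `parameters` arguments of a container's `query_items` call. One difference to keep in mind: the JOIN form returns a document once per matching array element, so a document that lists the same tag twice comes back twice.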

Side note: the performance of the two methods is basically identical, leading me to believe the query optimizer generates the same instruction set for both. So unless you have an array of complex objects, just use ARRAY_CONTAINS.

Cool, now we know how to query for documents that have our tag on them. One small problem: when you load a million, or even a hundred thousand, documents, your queries start taking seconds to complete and cost tens of thousands of request units. That's not going to work.

It seems the default indexing policy doesn't index the values in an array, which means CosmosDB has to open every document that matches your base query. Thankfully there is a way around this. If you add a path of "/tags/[]/?" to the indexing policy, you will find the same query returns in a fraction of a second and costs less than 100 RUs. Index paths are case-sensitive, so use the property name exactly as it appears in your documents.

Sample Index Policy:
{
  "indexingMode": "lazy",
  "automatic": true,
  "includedPaths": [
    {
      "path": "/*",
      "indexes": [
        {
          "kind": "Hash",
          "dataType": "String",
          "precision": -1
        },
        {
          "kind": "Range",
          "dataType": "Number",
          "precision": -1
        }
      ]
    },
    {
      "path": "/tags/[]/?",
      "indexes": [
        {
          "kind": "Hash",
          "dataType": "String",
          "precision": -1
        },
        {
          "kind": "Range",
          "dataType": "Number",
          "precision": -1
        }
      ]
    }
  ],
  "excludedPaths": []
}
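If you would rather patch an existing policy than paste the whole thing, the change can be sketched as a small function over the policy as a plain dict. The function name `with_array_path` is hypothetical, and the index entries simply mirror the sample above; this only builds the JSON and does not call any Cosmos DB API.

```python
import copy


def with_array_path(policy, array_property):
    """Return a copy of policy with an included path for an array's scalar items.

    Hypothetical helper: the index definitions mirror the sample policy above.
    """
    path = "/" + array_property + "/[]/?"
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    included = updated.setdefault("includedPaths", [])
    # Only add the path if it is not already present.
    if not any(entry.get("path") == path for entry in included):
        included.append({
            "path": path,
            "indexes": [
                {"kind": "Hash", "dataType": "String", "precision": -1},
                {"kind": "Range", "dataType": "Number", "precision": -1},
            ],
        })
    return updated
```

Calling `with_array_path(policy, "tags")` on a policy that only includes "/*" yields a second included path of "/tags/[]/?", and calling it again on the result changes nothing.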
