fleet icon indicating copy to clipboard operation
fleet copied to clipboard

Data quality bugs in several osquery tables

Open noahtalerman opened this issue 11 months ago • 2 comments

osquery version: 4.2 (old version but after testing the issue seems to still exist)


💥  Actual behavior

See actual behavior in the osquery issue here: https://github.com/osquery/osquery/issues/6319

It looks like some of these columns are only supported on specific platforms.

For example, the speed and physical_adapter columns in the interface_details table are only supported on Windows (according to Fleet's schema): Screenshot 2024-02-26 at 9 49 37 AM

However, when you query these columns on macOS, you get empty results instead of an error:

Screenshot 2024-02-26 at 9 52 33 AM

🧑‍💻  Steps to reproduce

Run scheduled queries against the tables reported in the osquery issue, with --verbose flag on.

🕯️ More info (optional)

Recently discussed in Hacker News here.

🛠️ To fix

For each buggy table presented in the osquery issue, either fix the table to return data (if possible) or a clear and easy to understand error message.

noahtalerman avatar Feb 26 '24 14:02 noahtalerman

From @lucasmrod: this could be a breaking change. cc: @noahtalerman

sharon-fdm avatar Feb 29 '24 20:02 sharon-fdm

@lucasmrod is this the expected behavior? For example, returning empty results when querying a Linux host w/ columns that are only support on Windows?

noahtalerman avatar Mar 01 '24 00:03 noahtalerman

@noahtalerman we are not sure what needs to be done here. Adding product label and assigning to you to check.

sharon-fdm avatar Mar 06 '24 19:03 sharon-fdm

@lucasmrod please add details and questions.

sharon-fdm avatar Mar 06 '24 19:03 sharon-fdm

Hey @lucasmrod and @zwass, is the current behavior documented in this osquery issue the expected behavior?

My understanding of the current behavior: when querying a Linux host for columns that are only support on Window, I see empty results for these columns.

Is that the breaking change?

If it is breaking, can't we still return an easy to understand error message instead?

noahtalerman avatar Mar 12 '24 13:03 noahtalerman

@noahtalerman when filing issues, can you please include the query you ran?

Based on the screenshot and looking at the osquery schema, I am guessing the query is SELECT speed, physical_adapter FROM interface_details;

I dug up the original PR implementing the platform-specific columns and this seems to be intentional behavior (and also unchanged since implemented 7 years ago).

I'm not sure what you mean by "is that the breaking change?"

zwass avatar Mar 12 '24 16:03 zwass

I am guessing the query is SELECT speed, physical_adapter FROM interface_details;

@zwass yep, that's the query! Sorry for not including it.

this seems to be intentional behavior (and also unchanged since implemented 7 years ago).

Got it. So this would be an improvement/feature request instead of a bug.

It sounds like the current behavior, returning empty results, is confusing (see comments in the osquery issue here and HackerNews here)

I'm not sure what you mean by "is that the breaking change?"

Let's say we want to update the behavior to return an error like "columns not supported: physical_adapter, speed" when I run the SELECT speed, physical_adapter FROM interface_details; query on a macOS host.

Is that a breaking change?

noahtalerman avatar Mar 13 '24 15:03 noahtalerman

@noahtalerman is there a specific comment you are referring to on HN? If so, can you please link it?

The osquery issue seems to be mostly about the warnings that are generated by casting errors? That seems to be related (because it's sometimes triggered by empty columns) but not the same as the platform-dependent columns issue your query triggers.

Your proposed change is definitely a breaking change. It would break platform compatibility for queries that explicitly select platform-dependent columns.

The current behavior is desirable for the following reasons:

  1. I can write a query that explicitly selects platform-dependent columns and still have the query return (partial) results rather than erroring out and having to be written separately for each platform. For example (interface_details table): SELECT interface, mac, link_speed, manufacturer FROM interface_details -- this query runs on all platforms and returns the results as available on the platforms. With your proposed change this query would no longer be compatible with any platform.

  2. If I SELECT * I only get the columns that are actually supported by the platform. For example on macOS:

osquery> select * from interface_details limit 1;
  interface = lo0
        mac = 00:00:00:00:00:00
       type = 24
        mtu = 16384
     metric = 0
      flags = 32841
   ipackets = 4402768
   opackets = 4402768
     ibytes = 3682536448
     obytes = 3682536448
    ierrors = 0
    oerrors = 0
     idrops = 0
     odrops = 0
 collisions = 0
last_change = 1706663951
 link_speed = 0

zwass avatar Mar 13 '24 17:03 zwass

I can write a query that explicitly selects platform-dependent columns and still have the query return (partial) results rather than erroring out and having to be written separately for each platform.

@zwass this totally makes sense.

the warnings that are generated by casting errors

Can we return a better warning in this case? (if we know these are unsupported/missing columns?)

Something like: "columns not supported: speed"

Instead of: "Error casting speed () to INTEGER"

noahtalerman avatar Mar 22 '24 14:03 noahtalerman

I tested and couldn't reproduce the logging, then dug in a little bit and noticed that the logging issues had already been fixed: https://github.com/osquery/osquery/issues/6319#issuecomment-2019065895

I closed the osquery issue. I think we should close this one unless we want to propose some other way to handle platform-dependent columns in osquery.

zwass avatar Mar 25 '24 23:03 zwass

Thanks for digging into this @zwass :)

If I'm understanding correctly, this is how osquery handles platform-dependent columns (when query is run on an unsupported platform):

  • Null or empty results are returned
  • Nothing is logged

Please correct me if I'm wrong.

@nyanshak, what do you think? Any feedback?

noahtalerman avatar Mar 26 '24 13:03 noahtalerman

UPDATE: Closing this issue.

The logging issues in osquery are already fixed: https://github.com/osquery/osquery/issues/6319#issuecomment-2019065895

This is how osquery handles platform-dependent columns (when query is run on an unsupported platform):

  • Null or empty results are returned
  • Nothing is logged

noahtalerman avatar Apr 01 '24 13:04 noahtalerman

Osquery bugs fixed, Clear warnings now provide aid, Fleet's path is vivid.

In the glass city, Data errors find no place, Clarity's embrace.

fleet-release avatar Apr 01 '24 13:04 fleet-release