Query optimization is not necessarily new. Cloud cost management that identifies and controls the cost of queries is not new either. What is new is Bluesky, a cloud-based workload optimization provider focused on Snowflake, which launched earlier this month to help organizations achieve these goals.
One of the critical elements in the company’s approach is “the algorithms we’ve created ourselves, based on each of our experiences over the past 15 years tuning workloads at Google, Uber, and so on,” said Mingsheng Hong, Bluesky CEO.
Hong is a former chief of engineering for Google’s machine learning runtime capabilities, a position in which he worked extensively with TensorFlow. He co-founded Bluesky with CTO Zheng Shao, a former senior engineer at Uber, where Shao specialized in big data architecture and cost reduction.
The algorithms Hong references analyze queries at scale, primarily in cloud environments, and determine how to optimize the workloads they belong to, reducing costs. “Individual queries rarely have business value,” Hong noted. “It’s a combination of them working together to achieve certain business goals, such as transforming data and providing business insights.”
What is especially interesting is that Bluesky combines both statistical and symbolic approaches to artificial intelligence (AI) for this task, tangibly illustrating that their merger could influence the future of AI in the enterprise.
Cost management of machine learning queries
There are several ways that Bluesky strengthens cost management by optimizing the amount of time and resources spent querying popular cloud resources. The solution can reduce query redundancy through incremental materialization, a useful feature for queries that recur at fixed intervals, such as hourly, daily, or weekly.
According to Hong, when analyzing monthly revenue figures, for example, this capability allows systems to “materialize the previous calculation and calculate only the incremental part,” or the delta since the last calculation. When widely adopted, this feature can save a significant amount of fiscal and IT resources.
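The idea Hong describes can be illustrated with a minimal Python sketch. It is not Bluesky's implementation: the `MaterializedRevenue` class, its `watermark` field, and the sample orders are hypothetical, and the sketch assumes orders never arrive late with a date at or before the watermark.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MaterializedRevenue:
    """Hypothetical materialized total plus a high-water mark of processed dates."""
    total: float = 0.0
    watermark: date = date.min  # latest order date already folded into total

    def refresh(self, orders):
        """Add only orders dated after the watermark; return the rows processed."""
        delta = [(day, amount) for day, amount in orders if day > self.watermark]
        if delta:
            self.total += sum(amount for _, amount in delta)
            self.watermark = max(day for day, _ in delta)
        return len(delta)


orders = [(date(2022, 8, 1), 120.0), (date(2022, 8, 2), 80.0)]
view = MaterializedRevenue()
first_pass = view.refresh(orders)    # both rows are new
orders.append((date(2022, 8, 3), 50.0))
second_pass = view.refresh(orders)   # only the newly added row is processed
```

The second refresh touches one row instead of three, which is the saving that grows with the size of the history. A real system would push the date filter down to storage rather than filter a list in memory.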
Bluesky provides detailed insight into query patterns and their consumption. The solution provides an ongoing list of the most expensive query patterns, as well as other techniques to “show people how much they’re spending,” Hong said. “We break it down into individual users, teams, projects, call centers and so on so that everyone knows how much everyone is spending.”
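A breakdown of this kind amounts to grouping query costs by a chosen dimension and ranking the results. The sketch below is only an illustration under assumed data: the `attribute_costs` function, the record fields (`user`, `team`, `pattern`, `cost_usd`), and the sample log are hypothetical and do not reflect Bluesky's or Snowflake's actual schema.

```python
from collections import defaultdict


def attribute_costs(query_log, key):
    """Sum cost per value of `key` and return (value, cost) pairs, most expensive first."""
    totals = defaultdict(float)
    for record in query_log:
        totals[record[key]] += record["cost_usd"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical query log; field names are assumptions made for this example.
query_log = [
    {"user": "ana", "team": "growth", "pattern": "daily_revenue", "cost_usd": 4.0},
    {"user": "ben", "team": "growth", "pattern": "ad_hoc_scan", "cost_usd": 9.5},
    {"user": "ana", "team": "growth", "pattern": "daily_revenue", "cost_usd": 4.0},
    {"user": "cy", "team": "finance", "pattern": "ledger_rollup", "cost_usd": 2.5},
]

by_user = attribute_costs(query_log, "user")
by_team = attribute_costs(query_log, "team")
by_pattern = attribute_costs(query_log, "pattern")
```

The same function serves every dimension, so a per-user view and a ranked list of expensive patterns come from one pass each over the log.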
Bluesky includes algorithms that use statistical and non-statistical AI approaches for profile-driven query cost attribution. Query profiles are based on how much time, CPU, and memory those specific queries require. The algorithms use this information to reduce the use of such resources for queries through tuning recommendations for changing query code, data layout, and more. “Optimization isn’t just computing power,” Hong noted. “We also organize the storage: the table indexes, how you organize the tables, and then there are warehouse settings and system settings that we adjust.”
Rules and supervised machine learning
Significantly, the algorithms that make such recommendations and analyze the factors Hong mentioned include rules-based approaches and machine learning. As such, they combine AI’s classical knowledge representation base with its statistical one. There are abundant use cases of such a tandem (called neuro-symbolic AI) for natural language technologies. Gartner has referred to the inclusion of both forms of AI as part of a broader composite AI movement. According to Hong, rules are a natural fit for query optimization.
“This is like query optimization, starting with rules and enriching them with the cost model,” he reflected. “There are cases where it is always a good idea to run a filter. So that’s a good rule. To eliminate a full table scan is always good. That’s a rule.”
Supervised learning comes in when rules are applied based on cost conditions or the cost model. Eliminating queries with poor ROI, for example, is a helpful rule. Supervised learning techniques can determine which queries fit into this classification by, for example, examining the queries from the past week before eliminating them through rules. “If a query fails more than 98% of the time in the past seven days, you can put such a query pattern in a penalty box,” Hong noted.
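The penalty-box rule in Hong's example can be sketched as a simple failure-rate check. This is a plain threshold over assumed data, not a trained model and not Bluesky's code: the `penalty_box` function, the `(pattern, succeeded)` record shape, and the sample history are hypothetical, and the input is assumed to cover only the past seven days.

```python
def penalty_box(history, threshold=0.98):
    """Return query patterns whose failure rate in `history` exceeds `threshold`.

    `history` is an iterable of (pattern, succeeded) pairs, assumed to be
    already limited to the past seven days.
    """
    runs, failures = {}, {}
    for pattern, succeeded in history:
        runs[pattern] = runs.get(pattern, 0) + 1
        if not succeeded:
            failures[pattern] = failures.get(pattern, 0) + 1
    return {p for p, n in runs.items() if failures.get(p, 0) / n > threshold}


# Hypothetical week of runs: one pattern fails 99 of 100 times, another 1 of 50.
history = (
    [("stale_join", False)] * 99
    + [("stale_join", True)]
    + [("daily_revenue", True)] * 49
    + [("daily_revenue", False)]
)
boxed = penalty_box(history)
```

Only `stale_join` lands in the penalty box here. The comparison is strict, matching the "more than 98%" wording, so a pattern failing exactly 98% of the time would stay out.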
The need to reduce operating costs, especially as they apply to multicloud and hybrid cloud environments, is sure to increase in the coming years. Cost management and workload optimization methods that optimize queries are helpful in understanding where costs are increasing and how to reduce them. Relying on automation that uses both statistical and non-statistical AI to identify these areas, while offering suggestions for solving these problems, can foreshadow where enterprise AI is headed.