Agent goals + Solution sketch

Define the specific goals for the funder research agent and map out the logic and approach to building a prototype.

(Transcript has been edited with the help of Google Gemini to remove filler words)

00:01

Hi everyone, welcome. In this session, we're gonna go over the third stage of the funder research agent building pilot, which is agent goals and solution sketch.

00:18

Just as a reminder, we are in the third step in the design phase of a four-phase process for this pilot.

00:37

So, in this session, we will cover a quick recap of what we did before the holidays in session 2, exploring agent platforms.

00:48

And we will then get into agent goal setting: a couple of exercises covering success metrics and how to distinguish them from signals; the golden path exercise, which we'll get into; the solution sketch; and then we'll talk about what's coming next

01:10

with data. So, a quick recap of the last session, which was on exploring agent platforms. The kit is available: if you go to the platform studio, open Exploring Agent Platforms, and scroll down, there's a short video summary you can access.

01:37

And then there are the kit resources as well as the transcript, and within the kit you'll find the presentation, which includes the visual versus developer agent-building platform reviews that we did, as well as a link to the backgrounder, which we also referenced in session one.

02:00

So those are the resources that are available and posted. And again, just to review, at the end of the last session, we had identified two low-code/no-code platforms that we were going to proceed with.

02:21

We'll be carrying those two platforms into the prototyping phase of this pilot project, and they are OpenAI's Agent Builder and Flowise. For a detailed review of why we chose these platforms, do revisit the materials available in the studio.

02:44

Okay, so we'll dive in here. Actually, just before we get into the session: in the last session we also talked about four data sources and connections that we could use to build out this agent.

03:05

We're going to focus on two. The first is the foundation open data that we worked with in the first session and manipulated with Google Colab.

03:17

That's what we're going to connect in via a third-party tool. The second is external web context: that foundation open data is from 2023, so we're going to look at augmenting it with more current information from the live web.

03:41

Those are our two sources of information. If we were building this for our own organization, we could of course also connect in our contact management system, or information from our email or file-sharing systems.

04:02

In order to do that, you just need to make sure that your organization is clear about how it's handling data privacy and all those pieces, and that should be consistent with your organization's data use policy and AI use policy.

04:18

Finally, user-provided context. We are going to have the user input a natural language query into the agent as the trigger.

04:40

Okay, so now we are moving ahead. We've finished the first phase, the learning and exploring.

04:49

We're moving into the design phase of the project now. In this session, we're gonna cover agent goal-setting, go through some exercises to set those goals, and then walk through an example solution sketch.

05:05

Again, I'm gonna show you examples, just like in all past sessions, and I really encourage you to follow along.

05:12

Feel free to do these exercises with your own organization's context in mind. What I'm providing are examples to work with.

05:24

I've provided links to some resources. These will also be linked in the kit.

05:37

The summary will be posted to the studio before the end of the week. And I'll just go through a few of these resources.

05:45

First, the Design Sprint framework. For this session, we're gonna refer directly to, and use, some of the tools provided by Google on the Design Sprint site, which we talked about at length

06:01

in the first session. I've provided links here to the Define Phase, covering all of the references that we're going to be using throughout this session.

06:12

The Define Phase, again, is where we take everything we've learned in the previous two sessions, our Understand Phase,

06:25

in order to home in on, establish, and define our goals for the agent. Okay, so that's the goal.

06:32

There are a lot of methods and exercises. We're not going to do all of these. In particular, we're going to focus on two.

06:40

One is success metrics: defining metrics versus signals, understanding the difference, and identifying our goal and our metrics. The other is the golden path exercise, or a simplified version of it.

07:00

And I'll just move along here. Beyond the success metrics and the golden path, we're also going to be referring to the Design Sprint's third phase.

07:13

In this case, just as a reminder, we've consolidated: if we go back to our method, we reworked the Design Sprint into four steps and combined that work into the third step here, and that's the solution sketch method.

07:41

Okay. In terms of the Miro canvas, we'll be using this for the rest of the session. This is where I've laid out some example exercises for us to refer to.

08:16

Okay, so these just provide links for goal setting, defining the agent, and the solution sketch, as well as the canvas on the Miro board,

08:25

which is just a workspace area that you can use; there are free versions available, and you can copy these resources as you see fit.

08:40

Okay, so let's talk about the goal-setting framework. So, the two exercises I mentioned that we're going to be referring to and using are the success metrics and signals, as well as the golden path examples.

09:08

So let's take a look here at success metrics and signals; let me make this a little bigger. This is a really helpful summary.

09:19

So we're talking about a goal. We want to be thinking about the big picture. What are we trying to help users do?

09:27

What problem are we trying to solve? We've talked about this again with some ideas around the goals that we have for this agent in sessions one and two.

09:40

Then the signals. We want to separate signals from metrics and have some clear metrics, so let's talk about metrics first.

09:49

We'll come back to signals. Metrics determine how we're going to actually measure the change, measure the results. These should be quantifiable.

10:00

A metric is an indicator that we've achieved, or are achieving, our goal. A signal is an indicator that we're potentially heading in the right direction.

10:15

As they said here, we're considering what change in user behavior or opinion would indicate that we've been successful in working towards achieving our goals.

10:26

Okay. So, these are signals we're heading in the right direction. We'll talk about an example for our agent.

10:34

But first, I've laid out what they've provided here, to give us a more concrete example of a goal versus a signal versus a metric.

10:48

This is the example they've provided about an e-commerce application. The goal that Google has laid out is to have users start using Smart Pay to pay their bills.

11:09

How are they going to measure that? The metric would be the proportion of clicks on the action that result in a paid bill; that's a conversion, meaning the user actually clicked and paid.

11:25

But the signal might be that somebody is clicking on the action to pay. That's user behavior indicating we're moving towards the goal.
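
To make that distinction concrete in code, here's a minimal sketch, using made-up event counts, of how this e-commerce example might compute the signal versus the metric:

```python
# Hypothetical event counts from an analytics log (illustrative numbers only).
clicks_on_pay_action = 480   # signal: users clicking the Smart Pay action
bills_paid = 312             # completed payments attributed to those clicks

# Signal: raw engagement with the action; suggests we're heading the right way.
print(f"Pay-action clicks this week: {clicks_on_pay_action}")

# Metric: the quantifiable conversion rate, i.e. the proportion of clicks
# that end in a paid bill, which ties directly back to the goal.
conversion_rate = bills_paid / clicks_on_pay_action
print(f"Click-to-paid conversion: {conversion_rate:.1%}")  # -> 65.0%
```

The signal is just a raw count that suggests momentum; the metric is the quantifiable rate tied directly to the goal.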

11:37

Alright, so the golden path example. It's helpful to pull this up, and we'll look at the directions and actions.

12:03

In this case, our product doesn't exist, so I'm going to focus there. If the product does not exist yet, we want to sketch out an ideal path through the new product, and keep it an ideal path.

12:22

Okay, so we're not going to be worried about diversions, edge cases, exceptions, et cetera.

12:30

So, if we go to our example here that they've provided, in this case, you can see the golden path is what's in yellow, and these reds are what we're not going to go down in this stage for the prototype.

12:46

We just want to focus on the sprint here, the golden path.

12:53

So, a user visits, they search for a product, they view the product, they buy the product, they receive the product, and they love the product.

13:02

So that's simplified. But again, if we're building out a prototype, we want to focus on a minimum viable product, we want to focus on what is our key goal that we're trying to achieve.

13:17

Now we'll get into our actual agent's success metrics and signals based on that.

13:29

And before we dive into the two exercises, I'll just pull from the backgrounder. If you go to the backgrounder, again linked in sessions one and two,

13:42

and also to be linked in session three, and click on the recommendations for participants, it provides some background.

13:53

These were some thoughts on how we were gonna proceed at this stage. We can refine them, but we've got some starting points on success criteria.

14:02

I've basically just copied that section, 8.2, here, so I'll refer to it here, and it's also linked in the Miro board.

14:20

So success criteria we had in the backgrounder: at a minimum, the agent prototype should be able to accept natural language queries. Just a plain language query, so you would type in plain English into a chat.

14:42

Like ChatGPT or similar tools, you'd just be able to ask a question in plain language about foundations.

14:50

And in return, you would get relevant foundation matches with reasonable accuracy and be provided basic information on each match, so you'd get things like the name, the giving amounts, focus areas.

15:08

Second: you would want it to return results that are demonstrably better than a basic search of the sources, for example, spreadsheets, Google search, or ChatGPT queries.

15:25

Meaning: if we get better results by just going to the government open data and searching that spreadsheet, either through the web or a downloaded CSV file,

15:39

or by going to Google or ChatGPT, for example, or any other publicly available large language model chat interface, asking basically the same question, and getting a better result than from our agent, then the agent hasn't achieved its objective.

16:00

Okay, so that's all I mean there. And while we have these two possible desired outcomes or success criteria in the backgrounder, that doesn't necessarily mean both are going to be our goal, so we'll come back to that in a second.

16:22

But first, I want to lay out the suggested testing approach, again from the backgrounder. These are natural language prompts, such as "foundations in BC that are funding environmental programs."

16:43

When we actually get further down the road, we'll expand on these and make the prompts more meaningful, but these are examples.

16:56

And frankly, these are the kinds of things that many users will type in. They may not flesh out or have a more robust prompt, so we actually want to test this on real-world types of examples that the average not-for-profit sector fundraiser would use.

17:15

And then, from an evaluation perspective, I'll circle back to this in a moment, but just want to reference that that's indicated in the backgrounder.

17:30

Okay, so that's the background information. Now we come to the first exercise, which is our success metrics and signals.

17:41

And if we're referring to the two success criteria, from my perspective, this second one actually sort of encompasses the first.

17:54

In that, if we're focused on getting demonstrably better results than a basic search, whether of source materials, spreadsheets, Google search, or other queries,

18:17

then that will mean, I think, that we will get relevant, accurate information.

18:27

And so I would suggest, because I would like to have one goal for this agent prototyping pilot, that we proceed with the goal of having results returned that are demonstrably better than a basic search of either the source spreadsheets or public web tools.

18:47

Okay? We could simplify this to an either/or, the source data or the public web resources, but I don't see any issue with targeting both.

19:05

I'd like to see this prototype outperform both. We're also going to be using both as inputs, so one would hope that we will do better.

19:18

We will see. Now, signals. These are the things we talked about that indicate we're heading in the right direction; they are not metrics, and they don't necessarily indicate that we've achieved our goal.

19:40

Now, the prototype we're building here is going to be used by participants in this program, who are co-building it together; these signals would make more sense if we were putting the tool in front of a testing group.

20:03

But this is an example, again. A few things came to mind; there's lots more we could put down.

20:16

I'd love to hear your thoughts. But, for example, the number of unique queries per user. So, how many times is a user putting in a question?

20:28

That could be a good indicator, because it means they're using it a lot. Could also potentially be a negative signal, because it could mean that somebody's trying and not getting the results they want, and they keep putting in queries.

20:45

But I think it's a signal of somebody coming back. And they're willing to stick around and use the tool.

20:56

I mean, that's an overall positive signal. We could also look at qualitative feedback, especially if we asked for an email (in this case, I don't think we'll be doing that).

21:09

But since we know our testing group, we could ask them qualitatively, after they've used the tool or as they're using it, how they're finding it. That could be a good signal.

21:22

Their overall impression and satisfaction, for example. And again, the number of visits to the prototype chat app and time spent on the page; we should have analytics on the page.
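
As a rough sketch of how we might tally a signal like unique queries per user, assuming we can export a simple usage log from the analytics (the field names here are hypothetical):

```python
from collections import defaultdict

# Hypothetical usage log exported from the prototype's analytics.
events = [
    {"user": "u1", "query": "foundations in BC funding environmental programs"},
    {"user": "u1", "query": "arts funders in Ontario"},
    {"user": "u2", "query": "foundations in BC funding environmental programs"},
]

# Signal: unique queries per user, a rough proxy for repeat engagement.
queries_per_user = defaultdict(set)
for event in events:
    queries_per_user[event["user"]].add(event["query"])

for user, queries in queries_per_user.items():
    print(f"{user}: {len(queries)} unique queries")
```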

21:36

Again, these are signals that we're heading towards our goal. Metrics-wise, we come back to the evaluation score.

21:46

What I'd like to suggest is thinking about how we're going to actually evaluate whether we have achieved our goal.

21:56

We've got a few criteria: relevance of results, completeness of results, response time, and quality of explanations. That all comes back to measures that will allow us to assess the results of a query to our agent versus the results of a search of a spreadsheet, Google, or the broader web.

22:23

And so I'm suggesting, and we'll see how it goes, that we define a set of test prompts and do a side-by-side search comparison.

22:40

We'll score those, and I think that will give us a good metric for evaluating, in a more systematic way, our agent versus a basic search.
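
To sketch what that side-by-side scoring could look like, here's a minimal evaluation harness in Python. The prompts, the four criteria, and the 1-to-5 scale are assumptions we'd refine together, not a finalized rubric:

```python
# Minimal evaluation harness sketch: score agent output vs. a baseline search
# on the same prompts, using the criteria discussed above. All names and the
# 1-5 scale are assumptions, not a finalized rubric.

TEST_PROMPTS = [
    "foundations in BC that are funding environmental programs",
    "private foundations that fund youth arts in Ontario",  # hypothetical example
]

CRITERIA = ["relevance", "completeness", "response_time", "explanation_quality"]

def average_score(scores: dict) -> float:
    """Average the 1-5 ratings a human reviewer assigned per criterion."""
    return sum(scores[c] for c in CRITERIA) / len(CRITERIA)

# Example: one reviewer's ratings for a single prompt, agent vs. baseline.
agent_scores = {"relevance": 4, "completeness": 4,
                "response_time": 3, "explanation_quality": 5}
baseline_scores = {"relevance": 3, "completeness": 2,
                   "response_time": 5, "explanation_quality": 2}

print("agent:", average_score(agent_scores))        # -> 4.0
print("baseline:", average_score(baseline_scores))  # -> 3.0
```

A reviewer would fill in one set of scores per prompt for the agent and for the baseline search, and we'd compare the averages across the full prompt set.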

22:57

We could also use, though I'm not sure we're gonna have the functionality for it in this prototype, the thumbs up/thumbs down you've likely seen.

23:10

So that's a really helpful quantitative feedback tool.

23:21

So in this case, rather than building thumbs up or thumbs down into our tool, we could have a follow-up prompt asking people whether the query gave them the information they were looking for.
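
A minimal sketch of that follow-up question, assuming a simple command-line style interaction rather than a built-in widget (the log format is hypothetical):

```python
# After showing results, ask the user directly whether the query succeeded.
# A lightweight stand-in for thumbs up/down; the storage format is hypothetical.
import datetime
import json

def collect_feedback(query: str) -> None:
    answer = input("Did these results give you the information you were looking for? (y/n) ")
    record = {
        "query": query,
        "helpful": answer.strip().lower().startswith("y"),
        "timestamp": datetime.datetime.now().isoformat(),
    }
    # Append one JSON record per line so responses are easy to tally later.
    with open("feedback_log.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
```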

23:40

Now, from a golden path perspective, this is a bit of a stretch because it's really quite a simple application.


24:15

So, in this case, again, this is just an example. We could add other steps, but a user is gonna visit our agent and enter a query.

24:23

They're gonna view the results. And that's really the golden pathway: they're gonna receive the results. They could download them.

24:33

They could copy and paste them. And hopefully, love the results. Or not love the results.

24:41

But that's our golden pathway. There are things we could focus on, but we're not going to for the prototyping phase.

24:52

Those are follow-up questions, diving deeper, and things like filtering by geography, grant size, et cetera. So we really are gonna focus, as an MVP or minimum viable product, on the initial query results and how they compare to basic searches.

25:15

So that's the golden path for our agent. Okay, so we've talked about goal-setting frameworks, agent goal setting, and the golden pathway.

25:34

Now let's get into the solution sketch. I find it helpful, and it's also the recommended approach for the design sprint, to do this physically.

25:57

Ideally, we'd have a team working on this sprint together in a room, again, as we talked about, in a consolidated time frame...

26:04

...but we're doing this remotely, and I'm leading the way through it. So where a whiteboard exercise with sticky notes would suit the goal-setting phase,

26:24

for the solution phase we might each sketch individually and present to each other. But in this case, I'll talk you through mine.

26:34

And as I said, I'd love to see yours if you've got a different approach or get your feedback.

26:41

So, the first step here shows that there'd be an interface. This is just OpenAI's Agent Builder.

27:01

It's showing a chat window, and I've got the user prompt, showing the user typing in whatever they need.

27:12

For example, "find private foundations in BC that fund environmental programs." Button to chat, simple text input, plain language input, no filters by geography or otherwise.

27:27

And that's the trigger. Okay, so that's what starts the process. Then that prompt goes into the agent brain, so to speak, the model.

27:40

If we're using an OpenAI model, it could be something like GPT-5.1. We'll have different models available to us

27:49

depending on the platform we're using. The sketch shows two streams: stream A up here and, down below, stream B (I forgot to label it, but I'll talk through it).

28:04

So stream A is fact-checking. So this is our vector database. This is where it's going to reference the Government of Canada's Open Data set that we manipulated and prepared in session one.

28:23

I'm gonna upload that in the next session, when we get into actually setting up the vector database in Pinecone.

28:41

That's going to allow us to pull out verifiable facts and increase accuracy. However, we know there are only so many data points in that dataset, and it's from 2023.
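
We'll do this properly in the next session, but as a preview, here's a minimal sketch of what the Pinecone setup might look like using their Python SDK. The index name, embedding model, region, and metadata fields are all assumptions for now:

```python
# Sketch only: embed rows from the prepared 2023 open data and upsert them
# into a Pinecone index. Index name, region, embedding model, and metadata
# fields are placeholders we'll settle on in the data-preparation session.
from openai import OpenAI
from pinecone import Pinecone, ServerlessSpec

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")

pc.create_index(
    name="foundation-open-data",
    dimension=1536,  # matches the text-embedding-3-small model used below
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
index = pc.Index("foundation-open-data")

# One illustrative row from the prepared dataset.
row = {"id": "fdn-0001", "name": "Example Foundation", "province": "BC",
       "focus": "environment", "giving_total": 250000}

embedding = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input=f"{row['name']} | {row['province']} | {row['focus']}",
).data[0].embedding

index.upsert(vectors=[{
    "id": row["id"],
    "values": embedding,
    "metadata": {k: v for k, v in row.items() if k != "id"},
}])
```

At query time, we'd embed the user's question the same way and call index.query() to retrieve the closest matches.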

28:50

So what we would like to do is combine that with live, more current context from the web.

28:57

For example, it's not necessarily going to be real-time up-to-date, and it's important to note that not every foundation even has a website.

29:09

In fact, probably most don't. I don't know the exact percentage, but a lot of the foundations listed are smaller and won't have websites.

29:17

So there may not be a whole lot, but for those that do have information publicly available, it could be as recent as late 2025, or whenever the model's knowledge is up to date. And actually, in this case, the agent will also have the ability to go out to the live web and use tools to browse even more current information.

29:44

Finally, just to explain these arrows: the model is checking the vector database.

30:05

It's also going to check the web, and we'll be able to compare what's on the web with what's stored in our vector database; the model, taking that information together, will then output a prospect brief.

30:30

Let's say it's the top 5 or 10; we can discuss how many. We'd like a standard, value-added format that lays out information such as status (is it an active foundation?), the rationale for why it's included in the prospect brief and why it would be a potential fit, and additional context. As we build out the agent, we'll define this further.
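
Tying the sketch together, here's a rough outline in code of the whole flow: query in, vector lookup plus web context, prospect brief out. This is a hedged sketch, not tied to any one platform; every helper function and field name is a placeholder we'll firm up during prototyping:

```python
# High-level sketch of the solution. The three helpers are hypothetical
# stand-ins for real components (embeddings API, Pinecone, a browsing tool).

def embed(text: str) -> list[float]:
    """Placeholder: call an embeddings API here (e.g., OpenAI)."""
    return [0.0] * 1536

def vector_db_search(vector: list[float], top_k: int) -> list[dict]:
    """Placeholder: query the Pinecone index built from the 2023 open data."""
    return [{"name": "Example Foundation", "giving_total": 250000,
             "focus": "environment", "active": True}]

def web_search(query: str) -> str:
    """Placeholder: a live web-browsing tool for more current context."""
    return "Recent public information found on the live web."

def run_funder_research_agent(user_query: str) -> list[dict]:
    # Trigger: a plain-language query from the chat interface.
    query_embedding = embed(user_query)

    # Stream A: verifiable facts from the vector database.
    facts = vector_db_search(query_embedding, top_k=10)

    # Stream B: more current context from the live web.
    web_context = web_search(user_query)

    # The model would compare and combine both streams; here we just
    # assemble the standard, value-added prospect brief format.
    return [
        {
            "name": match["name"],
            "status": "active" if match.get("active") else "unknown",
            "giving_amounts": match.get("giving_total"),
            "focus_areas": match.get("focus"),
            "rationale": "why this foundation is a potential fit",
            "additional_context": web_context,
        }
        for match in facts[:5]  # say, the top 5 or 10
    ]

print(run_funder_research_agent(
    "find private foundations in BC that fund environmental programs"))
```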

31:03

Okay, so that's the high-level solution sketch. You can see the next step we're getting into; this is the last step before we start prototyping and building the agent.

31:16

Next session we're going to focus on preparing the data using pinecone.io, a platform you can sign up to for free.

31:39

I just wanted to point out the demo account here. You can go to pinecone.io and sign up for a free account; doing that in advance will help you follow along, but there's no need to.

32:05

You can sign up afterwards, or just follow along in the next session.

32:11

Okay, looking forward to seeing you in the next session, and if you have any questions or feedback in the meantime, I look forward to hearing from you.
