QUT Distinguished Professor Kerrie Mengersen, Deputy Director of the ARC Centre of Excellence for Mathematical and Statistical Frontiers, speaks to Kate about rich opportunities for business and policy insights from burgeoning sources of data and new techniques of data analytics. Among many examples, Kate and Kerrie cover airport congestion and measuring the impact of the $3.6B Queen’s Wharf casino and retail development project in Brisbane’s CBD.
00:00:07 Kate Joyner
Welcome to QUT Exec Insights, brought to you by QUTeX – professional and executive education for the real world. In this episode we’re continuing with our series, where we share with our listeners the wonderful richness of the QUT Community of researchers and teachers. Those who are making an impact on the real world. I call this thread Cool QUT. With me is Distinguished Professor Kerrie Mengersen.
00:00:30 Kate Joyner
Kerrie’s research expertise is statistical methodology and its applications, and this includes of course, making sense of big data. Kerrie is Deputy Director and Chief Investigator in the Centre of Excellence for Mathematical and Statistical Frontiers, and she has around her a group of about thirty postgraduate and postdoctoral researchers in this increasingly in demand area of expertise.
00:00:50 Kate Joyner
So welcome Kerrie.
00:00:52 Kerrie Mengersen
00:00:53 Kate Joyner
So, a lot of listeners will be saying “Now, this thing about big data…” So I think we’ll jump straight into that kind of area.
00:01:00 Kate Joyner
So, the presentation that I saw you do when I was preparing for this interview was entitled “Now that we have it, what do we do with it? How Maths and Stats can save us from drowning in Big Data.” So, first of all, when we say “big data”, what are we actually talking about?
00:01:19 Kerrie Mengersen
Interesting question. The area, the words big data mean different things to different people and typically we’re talking about data that are unusual for us to deal with. So now we have a range of a whole wide variety and volume and different types of data that we can have access to. So, whereas before we might have been constrained to collecting observational data, now we have sensors, we have mobile phones, we have social media. We have all kinds of different access and sources of data.
So, now the question is, you know, what do we do with all that kind of different data? And so different groups will be able to deal with different types of data. So, for example, satellite data has been around for many years and the groups that work with satellite data know how to deal with it. But for most of us if we got an image from a satellite, we wouldn’t know how to extract information from that. Similarly, there’s brain scans. How do we deal with brain scans? And not just one, but many multiple brain scans over multiple people? How do we deal with data from social media? And so, one of my colleagues Clair Alston-Knox came up with the term “big data is inconveniently large data”.
00:02:44 Kate Joyner
I like that definition. I saw that and I thought that was really … it kind of escapes our normal abilities to…
00:02:50 Kerrie Mengersen
It’s the data that we don’t really know what to do with in terms of size or volume or type or the quality of the data and so on. So, we need to take all of these things into account when we’re trying to really extract the value from the data that we have available to us nowadays.
00:03:07 Kate Joyner
And in business, I mean, we know that some people, some businesses are absolutely competing on their ability to grapple with big data. So, we know that what they call the FANGs, so Facebook, Amazon, Google, so forth. They are really out competing on the basis of actually having sensors in people’s hands. So, they’ve got a really good bead on consumers and their preferences and over time that competition will absolutely, you know, grow in importance. So what are your observations, certainly in the business field, about people’s capability to make sense of big data?
00:03:45 Kerrie Mengersen
So, the big data is not just for the Google and Amazon companies but it’s also for smaller companies so for example, getting to understand customers, getting to understand trends, need to understand products and so on. But it’s… and so companies at all levels and businesses and governments can all benefit from using data in different and innovative ways. And the sorts of things that I see, the trends that I see are where people were asking the question of “What kind of data can I access?” to now being asking the question of “What do I do with the data that I can access?” because it is much more open now. There’s a big push for open data, for quality data, for repeatable data – so data that you can, or analyses that you can replicate – and so that brings us to the question of trust in the data. So, people are now asking “How trustworthy are my data?”
00:04:53 Kerrie Mengersen
And they’re also asking the question of, there’s big questions around storage and management of those data. And those kinds of technologies, using cloud computing and other sources of storage and management are really now taking off and becoming much more mainstream. And so, the questions now are not around “How do I get data?” or even how I store it, but more about how I analyse it. And so, the real trends now are in analytics. Advanced analytics is really now where the business value is in data.
00:05:30 Kate Joyner
So, I mean that’s really interesting that you said the potential for big data is not just for the, you know the Facebooks and Amazons, but small business. So, even small retail could leverage off the power of the data that they have probably even more freely available to them. So, what capabilities will they require to really extract the value from the data they have available to them? Do they need to access your postgraduate students?
00:05:56 Kerrie Mengersen
(laughs) Well, that’s a very good start. But I think the capabilities now are around data analytics. So, there’s some capabilities around manipulating data, understanding the value of the data that people have and being able to sort of trade on those data. Understanding then about doing basic kinds of analysis with the data, but also understanding that not everybody can do everything and so it’s really entering into partnerships with groups that can analyse data or can help to train people in organisations, because sometimes what we see are people are setting up a data science team for example, which might be the one or two people and then they’re in isolation. So, part of that is creating the networks of the analysts so that those analysts can then learn from each other. It’s such a fast moving area and what we want to be able to do is to bring the value to the collective group of analysts.
00:07:00 Kerrie Mengersen
So, part of what we’re doing is not just training our high-end analysts but also then providing more general training about analytics. And also, about smart thinking, too, so that even if you’re not the person doing the analysis, you might at least have a good understanding of the kinds of questions you could ask of the data and what you might reasonably ask of the results. So, you don’t just need to accept results on face value, but you know, you have a good sense of “How do I critically review the kinds of results that I’m seeing from the analyses?” So those kinds of high-level review or critical review methods and skills, and then also of course for the technical people, the analysts themselves about having access to our support and training for skill development.
00:08:00 Kate Joyner
Yeah, so the right kinds of questions and the way to approach data, so all that richness of statistical thinking.
00:08:05 Kerrie Mengersen
Yeah, that’s right, and it really opens up a new way of thinking about things. So, we don’t have to be thinking about our demographics or about our products or about the services or these kinds of things in the same way that we did before. So, for example, the Bureau of Statistics now, and official statistics around the world, are being really proactive in understanding how they might use different sources of data to compile the sorts of statistics that they would generally publish. So, for example, statistics on poverty or statistics on crops, agriculture. Farmers suck at filling out surveys and so how do we, how do we get at that information? Well, we could use remote sensing, so satellite data, coming back to that. So, then they need the training in how to use those data but then also how trustworthy are those data for official statistics? And then how do managers then actually convert that to you know to use for official statistics and so it’s not just companies but it’s also governments.
00:09:24 Kate Joyner
Governments, yeah, and I was really intrigued by the example that I saw in your presentation about the question how do we understand congestion at airports? That’s practical and that opens up all kinds of questions. It’s not, you would think about, well certain times a day, you know, there’s more flights, but it invites a more systems view, I suppose, those kinds of questions. If you have a more comprehensive set of data that you can exploit.
00:09:49 Kerrie Mengersen
Yeah, and you can imagine at the airport there’s all sorts of data that are available from image recognition of people walking through the airport to the sensors for people in queues, to all the bags that are checked through to flights arriving and the demographics of people arriving. So, lots of information that can be compiled and then used to improve efficiency at the airport and that’s the job of big data and it’s the job of the analysts like us.
00:10:18 Kate Joyner
So how evolved, in your observation, are business, and I’ll go in Australia here, in terms of their ability to extract value. Are we at the well-evolved part of the spectrum or are we lagging other countries do you think?
00:10:34 Kerrie Mengersen
Certainly, there are some other countries that have set up, like, National Centers for Data Science, for example. Australia doesn’t have anything like that at the moment but they do have fairly advanced groups working in this area, so there’s the university groups and CSIRO Data61, there’s also companies that are really quite advanced in this field and world-leading.
00:11:00 Kerrie Mengersen
But generally, you know it’s across the spectrum so while there are leading organisations, then there are also companies that are just sort of starting out on the journey and I think it really is this, creating these networks across different organisations, so that there can be discussions at different levels.
00:11:21 Kerrie Mengersen
So, while some people are still asking the question “What’s the value in doing this?” because there is a setup cost in obtaining those data and, you know, developing the analytic tools for it, right through to the “OK. Now we’re well on the way in doing this and how do we start to think more creatively about what we can do?” So, Australia is like most other countries in that way, in having that range of…
00:11:49 Kate Joyner
I had heard big data described like teenage sex, you know, everyone says they’re doing it but no one actually is. We’re all still a bit nervous about the whole thing.
00:12:01 Kerrie Mengersen
That’s right, exactly.
00:12:02 Kate Joyner
And also, you mentioned before about open data and opening our data sets, so we have a Public Sector Management Program here in QUTeX so, and I talked about this with our students and they said “Oh, you know, we’re trying to get data sets that we have up and freely available to the community ’cause we don’t know what they might want to do with it.”
00:12:22 Kate Joyner
They may create a use, you know, of all the data that that governments collect but then sometimes months later they’ll say, “Oh, you know, we’ve actually pulled that data set back and we’re not making it available.” What’s your sense of our appetite for making the data that we have in government open and freely available for whomever wants to do creative things with it?
00:12:40 Kerrie Mengersen
It’s a bit of a push-pull system, I think, in that when the data are available then people will… Well, sorry so, so demand creates a supply and also demand creates quality data. So, often people will collect data but there could be quality issues related to it, and then those quality issues really get resolved or data become better quality when people realise that there is a demand for it or it’s being used, of course. We do better when we realise that our work is being appreciated and being used. There’s also then the question of that data aren’t collected for free and there’s a cost to collecting data.
00:13:26 Kerrie Mengersen
And so, I think there’s a lot more realisation about the value of data and that plays out in different ways, so organisations or governments might say “Well, this data, these data are actually valuable and there should be some payment associated with it or some reward associated with providing it.” But then realising that a lot of the value of the data comes when it’s actually, the analysis or the benefit is realised. And so, there is this sort of push-pull about making data, like freely available when it’s being collected. And there’s also privacy and confidentiality issues, of course, and the scale at which the data might be made available. So those kinds of things come into play as well, but yeah, to my mind when data are open, then there can be so much more benefit that can be gained from the collective.
00:14:23 Kate Joyner
Yes, that’s right and who knows what our creative scientists will do with different data sets? To see the intersection and make questions among those. So, you’re doing some research very close to home here at the QUT Gardens Point Campus, which is with the new Queen’s Wharf development, so some of the questions that you are asking with the benefit of data that you can collect is the impact of that Queen’s Wharf development. For those of us who are listening across the world and if you are out there, thank you for that, very close to the, uh QUT Gardens Point is an inner-city campus in the CBD, and on our doorstep will be a multi-billion dollar casino development with retail and some open space and there are a lot of questions to ask about what the impact will be of such a large-scale development. Which of course, one of many large-scale developments happening in Brisbane at the moment. So, what kind of questions do you think you’ll be able to answer? And what kinds of data sets will you pull on to do that?
00:15:25 Kerrie Mengersen
This project was commissioned by the state government and now also has the backing and involvement of Star, too, so involved in building the project. So, it was really commendable I think, of both these organisations to have the vision to develop this long-term monitoring program. So, another development like this in Brisbane is South Bank, which was developed 30 years ago and so that was for Expo and so at the time when South Bank was developed there was really no concept of having a long-term monitoring of the impact of South Bank. But it has had an enormous impact on our city and so now with this new development which is going to be of comparable size, then there is now the motivation to put in place a long-term, 30-year, monitoring program to identify the benefits and impacts of such a development. An urban development.
00:16:29 Kerrie Mengersen
And so, we were asked to set up a plan for that long-term monitoring program. And so over the course of a couple of months really sat down and asked “Well, what are the types of economic and social indicators and measures and what kind of data might we have on those?”
00:16:57 Kerrie Mengersen
So, we came up with a wheel, and on that wheel were 22 economic and social indicators, and then identified the data sources for those. So, those data sources were from government and also from private industry so part of it was for example, from Telstra about mobile phones and people movement so connectivity around the city. How would such a development improve connectivity? What kind of social media is being used and what signals can we have from that social media about the development? What do we learn about tourist numbers and attraction to the area? What’s happening in terms of jobs, in terms of buildings and that kind of stuff? So, a whole range of different impacts. There’s impacts of employment, but also impacts of gambling.
00:18:00 Kerrie Mengersen
And so, very strong emphasis on looking at the benefits and impacts of gambling and looking at having an Advisory Board that would oversee the surveys that are being undertaken for that, and also the collection of data from Queensland, but also from Victoria and New South Wales as well and learning from developments in those states to be able to inform this development as well.
00:18:28 Kate Joyner
So it’s the full range of social and economic, technological, political problems. With the richness of all that sensor data and data that we can exactly know the full range. Yeah, that’ll be an amazing thing to have a longitudinal study inform.
00:18:46 Kerrie Mengersen
Yeah, I think it will be very interesting to be able to in the future look back and say, “How has it changed our city?”
00:18:52 Kate Joyner
So, with your students Kerrie, and you have a range of people in the centre that you mentor and oversee their research, so how do you see their discipline changing and evolving over that time? So, when we gradually move away from academia – I was saying to Kerrie before, she and I are a month’s in… we’re the same age to only a few months, so – but after we move out of academia and they’re still at their peak, how will that whole area of maths and stats and AI and machine learning change and evolve in the next 5 to 10 years? I imagine it’s going to, you know, I just see this great flowering and fruition of demand for their skills and their discipline changing with the amount of data available to them. Have I got that about right?
00:19:40 Kerrie Mengersen
Yeah, so as we were saying the area that’s really in demand is the advanced analytics. And that’s across the range of maths, stats and machine learning techniques and bringing together those different disciplines and really that’s what we’re trying to do in our group, to create the interdisciplinary teams that might be able to solve more complex problems. So, bringing together economists and social scientists and statisticians and machine learners and computer scientists to really ask questions of the problem and to really try to drive new solutions and new insights. We have really also then the ability to bring in, and part of the remit of our group is to have secondments from government and business to come in and learn techniques and to be able to develop for example, policies around big data and the rollout of data analytics in government. So, coming in, being able to be exposed to the different new ideas and ways of thinking and new techniques and then taking those back into industry and government. And that kind of porous boundary between universities and governments, I think and organisations, will be really important in the future. This is such a fast-moving area. There’s so much to do and there’s a lot that we can learn from each other so that’s going to be really important.
00:21:14 Kate Joyner
So no one will let you retire, I don’t think. You’ll have to stay with us for a while.
00:21:20 Kerrie Mengersen
The other, I think really interesting thing is that there’s so much online now that can be done. So, I have students who really come in and say, “I don’t need you to teach me anymore.” and this is from our undergrads as well, so, and postgrads as well, so they’ll say “I don’t need you to teach me. I just need you to point me in the right direction.” because there’s so much out there on the web and I know that I need to learn skills. I don’t know what it is that I should learn and I don’t know what the best things, best way to learn like the best tools or the best sites are for learning that, so send me a program, send me on my way. And I’ll just work my way to do it. It’s amazing. It’s great to see.
00:22:05 Kate Joyner
Yeah, totally. So, talk about transdisciplinary, so Kerrie finished her presentation that I saw bringing in some aesthetics, which was poetry. And I just thought this brought me closer to understanding the power and the beauty of maths, so I’ll end with Kerrie sharing that with us.
00:22:23 Kerrie Mengersen
Yeah, so this is a poem by John Gillespie Magee, and it’s called High Flight, and the first and last part of it is really beautiful, I think, so it says:
Oh! I have slipped the surly bonds of Earth
And danced the skies on laughter-silvered wings;
Sunward I’ve climbed, and joined the tumbling mirth
Of sun-split clouds, – and done a hundred things
You have not dreamed of …
And, while with silent, lifting mind I’ve trod
The high untrespassed sanctity of space,
Put out my hand, and touched the face of God.
00:23:03 Kate Joyner
And on that note, Distinguished Professor Kerrie Mengersen, thank you for joining us for our 21st episode of QUT ExecInsights.