Can AI Detect Developmental Delays at Home? with Bharath Modayur of Early Markers | Your AI Injection

What if you could monitor your baby’s development with just your phone?

In this episode of Your AI Injection, Deep Dhillon talks with Bharath Modayur about Early Markers' vision to bring developmental screening into every nursery. Their system helps parents identify delays in motor skills and other milestones by combining AI with clinically validated screening tools. Bharath shares how real-time feedback and targeted activities may improve outcomes for infants while reducing anxiety for parents.

Learn more about Bharath here: https://www.linkedin.com/in/bmodayur/
and Early Markers here: https://earlymarkers.com/index

To find out more about how to use pose analysis to analyze activities, visit our website: Human Pose Analysis

[Automated Transcript]

Deep: Welcome to this week's episode of Your AI Injection. This week we've got Bharath Modayur. Bharath received his PhD from the University of Washington, where he is currently an affiliate assistant professor. We've got Bharath on to chat about his startup Early Markers, which is leveraging AI technology to detect developmental issues in infants.

I'm super excited to chat with Bharath about the great work that Early Markers is doing and the AI powering it.

All right, Bharath, thanks so much for being here. I'm super excited to talk about Early Markers. Let's get started: why don't you tell us a little bit about your early inspiration for the project?


Bharath: Yeah, first of all, it is exciting to talk to you guys. I always get supercharged after these conversations.

The short story is that I'm a computer vision guy: mid-nineties PhD from the University of Washington. And at that time there were not really that many avenues to put the skills you learned in grad school computer vision to something practical.

So I hadn't done much with computer vision for the earlier part of my career. Right around 2012 there were some demos at the University of Washington electrical engineering department. They were doing pose estimation on adults: just figuring out where the body joints are on adults moving around, typically in an upright position. They had applications for surveillance and suspicious activities, and that lit a spark. I did a proposal to DARPA at the time to develop something along those lines, and the ideas were accepted.

And then we realized that you're transferring all your IP over to DARPA, and it's not something that I really understood, so I backed off a little bit. Some mentors in my circle pushed me toward healthcare and the UW autism department. So I hooked up with the researchers there and looked for ways we could use computer vision video processing to look for early markers, early signs of autism.

That was the germination of the idea. The key thing being that there is a lot of anxiety, especially for parents that already have a child with autism: the likelihood goes up to 17 percent if an older sibling is diagnosed with autism.

And so, how can we help them look for signs early, and what can parents do at home to engage with the child to develop skills like motor skills and social skills? That was the idea. We have since branched off from looking for something very specific to autism to generalized developmental atypicalities and developmental delays. That's what we're focused on.

Deep: Maybe walk us through the benefit of early diagnosis first. And then walk us through what markers you're looking for in the video signal of a baby, and how they actually predict anything.

Bharath: In general, early detection can lead to early intervention, and early intervention is certainly linked to better health outcomes. For instance, about 15 to 20 percent of children have a developmental or behavioral disability, and only a third of them get diagnosed before they enter the school system. Early detection and early intervention can save society up to $100,000 in social service costs. That is the motivation for looking for delays and markers early, so that you can get to a diagnosis and intervention early.

It's better health outcomes, and you save money as well. I want to be clear about the fact that markers are not a diagnosis, right? Just because you find something that is a marker for diabetes doesn't mean that you have diabetes.

That's a distinction we typically have trouble with when we write NIH proposals, the conflation of markers with diagnosis; we have to be very clear about what is what. The earliest markers that you can typically see are motor related.

Social and communication skills come later for the infant. Markers linked to autism are delayed motor development and less time spent in advanced postures like tummy time. And then later you have atypical gait and asymmetry. And there is this study, again, not a big study, but it gave tantalizing clues: the infant is lying down in the crib, and you look at the posture of the infant to see whether it is symmetric or asymmetric.

The more time the kid spends in asymmetric poses in the crib, the higher the correlated risk for autism. So these are markers; again, just something to look out for, and cumulatively they lead to a specific condition. Motor is interesting because a lot of development in other domains cascades from the development of motor skills.

They call them the sticky mitten experiments, where they attach something like a Velcro glove to the infant at three to five months old. It allows the kid to reach and essentially grab even before they have the skills to grab. Longitudinally, they look at these three-month-olds that have been trained to reach for things and how they develop at 13 to 15 months old, and they find that their ability to explore objects is clearly improved because of that early training.

It shows that motor development leads to development in other domains like language, communication, and social interaction. The ability to sit independently and point to things improves social bidding; it leads to interaction with adults. And the ability to crawl independently and later walk allows the baby to explore and interact with the environment.

And just those interactions drive development in other domains. So motor development is pivotal. And from a computer vision perspective, it's also convenient for us to passively look at a baby and glean things, instead of actively fitting them with wearables.

Deep: So before we dig into the how-are-we-doing-this-with-machine-learning part, how does autism typically get diagnosed today, and what role does this kind of infant-level behavioral or postural analysis play?

Bharath: Typical diagnosis happens, on average, at three-plus years, even though diagnosis can be performed at 18 months. If we're not talking about autism specifically, there are earlier things you can find in terms of delayed achievement of milestones.

It keeps the parent engaged so that, during well-child visits or through observations and interactions, the parent is able to look for these early signs, right? And that is especially crucial for a parent that already has a familial history of autism, so they can look for these early markers, if you will.

Deep: Yeah. It sounds like you're saying people are not identifying these conditions early enough. Maybe there are some parents who have a first or earlier child with a condition, so they know to look for it, but there are a lot of folks who don't. And if we can identify these folks, because we can make it lower cost and convenient, they don't need to know ahead of time what to look for, and then they can take the child to their physician and maybe get routed to some kind of behavioral or occupational specialist, or what have you. Is that basically the point here?

Bharath: Yeah, the convenience and the passive observation, and not really expecting the parent to know everything, and guiding the parent through activities at home that promote development along all these domains, right? Even the diagnosis of autism, it's now possible to do it younger than, say, 10 years ago.

But parents have to be motivated or concerned for you to get there, and they have to think...

Deep: ...and then they have to know what to look for, which requires an a priori education.

Bharath: The American Academy of Pediatrics recommends developmental screening for infants, especially motor screening. There are about six well-child visits recommended in the first year of life, and on average, parents make it to 2.2. Which means that you're not even being seen by a clinician. The more tools we provide for the parent and the clinician, to perhaps even remote-consult and look at the baby without the parent having to go to a clinic, all of those lower the barriers and expose the child to being observed by a clinician early and often.

Deep: So that 15-minute observational window once every few months might not be enough for the physician to notice anything. Is that correct?

Bharath: And that kind of goes to the heart of what we have developed and what we have in the pipeline: being able to observe the baby in a naturalistic environment, which is home, and being able to observe the baby for longer periods of time and more often, right?

You don't know what the circumstances will be during a clinic visit, the mood of the baby at the time; it's a compressed timeframe, and the baby may not be in the best state to express whatever skills he or she has. So the more opportunity you give the infant to be observed and to express their skills, the better off you are in recording something that's more accurate than a clinical setting observation.

Deep: Let's change gears a little bit. For the sake of our audience, we're going to talk about an AI technique called body pose analysis. If you think about the human body, you've got all these joints on it, and you've got a skeleton that you can lay over that and track over time. So you can see the joints moving across time, and then you can start to assess whether or not there's some asymmetry, like Bharath's talking about, or whether or not a baby is lying down or crawling, et cetera. So let's start with the skeleton extraction itself, the pose extraction: what are some of the challenges with babies versus adults with respect to actually getting the pose out in the first place?
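
For technically minded listeners: given per-frame joint positions from a pose estimator, one simple way to quantify the left/right asymmetry Deep describes is to mirror the left-side joints across the body's midline and measure how far they land from their right-side counterparts. Here's a minimal sketch; the keypoint names, coordinates, and midline heuristic are our own illustrative assumptions, not Early Markers' method.

```python
# Minimal sketch: scoring left/right postural asymmetry from 2D keypoints.
# All names, coordinates, and the midline heuristic are illustrative.
import numpy as np

# One frame of hypothetical pose output: joint name -> (x, y), normalized.
frame = {
    "left_shoulder": (0.42, 0.30), "right_shoulder": (0.58, 0.31),
    "left_elbow":    (0.35, 0.45), "right_elbow":    (0.66, 0.40),
    "left_wrist":    (0.33, 0.58), "right_wrist":    (0.70, 0.47),
    "left_hip":      (0.45, 0.60), "right_hip":      (0.55, 0.60),
    "left_knee":     (0.43, 0.78), "right_knee":     (0.58, 0.74),
}

def asymmetry_score(kp):
    """Mirror each left joint across the hip midline and return the mean
    distance to its right-side counterpart; 0 would be perfect symmetry."""
    midline_x = (kp["left_hip"][0] + kp["right_hip"][0]) / 2.0
    pairs = [("left_shoulder", "right_shoulder"), ("left_elbow", "right_elbow"),
             ("left_wrist", "right_wrist"), ("left_knee", "right_knee")]
    dists = [np.hypot((2 * midline_x - kp[l][0]) - kp[r][0], kp[l][1] - kp[r][1])
             for l, r in pairs]
    return float(np.mean(dists))

print(f"asymmetry: {asymmetry_score(frame):.3f}")  # larger = more asymmetric
```

Tracked over many frames and sessions, a persistent skew in a measure like this is the kind of signal that could be surfaced for a clinician to review.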

Bharath: They are more varied, different proportions, and squishy, if you will. So when we started this out, there was a Phase I NIH grant. We had this idea to recognize where the baby is, where the baby's parts are, where the baby's joints are, and what the baby is doing in terms of specific activities.

So that was the thesis, and then we came down to implementing the technology. A brief history is that we approached it from a classic computer vision, or what they call machine vision, approach, where you're trying to look for specific features in the image: does it look like an elbow joint?

Does it look like fingers? Does it look like a face? Then you extract these features, put them together, and constrain them, saying that these features can't just occur anywhere. So if you're pretty sure that's the head, below that will be the shoulder, et cetera.

You're constrained by human anatomy. And of course, most of these classic algorithms were written for people sitting at a desk, walking around, or doing common activities, whereas in the observational videos of babies, they're not sitting.

Most of them are crawling around, and there weren't enough examples for us to work the classic computer vision way. Later, when deep learning exploded, people were able to do some previously undoable things in recognizing animals and people and where people's joints are. The main technical problem we ran into was that we didn't have the data. We don't have infant data, and infant data is hard to come by; for obvious privacy reasons, you don't find a lot of it on YouTube. So we decided to do our own study, and we recruited and imaged 68 infants in a one-year period. We have about 30 to 40 minutes of data on each one of those infants, and we used a Microsoft Kinect at the time, so we had not just the color images but also information about the depth of each pixel in the image. It gave us additional information.

We went through this extensive period of labeling these images in terms of where the different joints are. That's pretty painstaking, but you can use human resources; you don't need much training. Everybody knows what a baby looks like and where the baby's joints are.

You could get a high school or middle school student to do that. But then comes the second part, which is labeling specific parts of the video where the baby's doing something interesting. For instance, is the baby rolling over? And there are actually three different kinds of rolling over.

There are nuances that only a developmental expert will be able to tease out, so we use expert time to annotate parts of the video where specific things are happening. Once you have all that, we have these deep learning systems that we train on the data to (a) figure out where the body joints are and (b) see where specific motor activities are taking place.
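
To give a sense of what the two annotation layers Bharath describes might look like in practice, here's a hypothetical schema sketch: cheap, non-expert keypoint labels per frame, plus expert-labeled activity segments. The field names are our assumptions, not Early Markers' actual data format.

```python
# Hypothetical schema for the two labeling passes described above.
from dataclasses import dataclass

@dataclass
class KeypointLabel:          # non-expert pass: anyone can mark joints
    frame_idx: int
    joint: str                # e.g. "left_wrist"
    x: float                  # pixel coordinates in the color image
    y: float
    depth_mm: float           # per-pixel depth from the Kinect sensor

@dataclass
class ActivitySegment:        # expert pass: nuanced motor items
    start_frame: int
    end_frame: int
    label: str                # e.g. "roll_supine_to_prone"
    annotator: str            # track who labeled it; expert time is scarce

labels = [KeypointLabel(0, "left_wrist", 211.0, 340.5, 870.0)]
segments = [ActivitySegment(1200, 1420, "roll_supine_to_prone", "ot_expert_1")]
print(len(labels), len(segments))
```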

Deep: So part of it is you can't just use off-the-shelf body pose packages, apply them to babies, and get too far, it sounds like. And the second part here is that you went out, started gathering your baby data, and trained up your own models for pose extraction on the babies, so now you've got the ability to extract the skeleton.

What are you trying to do next? Once you've got the skeleton, walk us through the actual poses you're trying to get to, and some of the challenges in getting from the skeleton over time in a video to the postures you're after.

Bharath: So the solution we're developing is for a clinician to be able to do a standardized, clinically validated assessment, one that is objective. It is not just looking at the baby and a checklist and saying: the baby is five months old, so is the baby rolling over?

Yes or no? You ask the parent, and those are questionnaire-based assessments of the child, not as accurate as something that is observation-based and uses a clinically validated tool. What we are trying to do is see whether the parents themselves can administer this test.

So think of it as a clinical test that the baby goes through, which is all passive observation by an expert, occasionally facilitating movement. You can't just ask the baby to do something, so you have to provide some cues: rattle some things and give some enticements so the baby crawls or rolls over or reaches for stuff.

Right now that is done in a clinic under clinician supervision. They look at how the baby behaves and score the baby, then you get a percentile ranking, and then you figure out whether the baby needs further evaluation, needs a specific intervention, or everything is going fine, let's see you again at the next well-child visit. Our first idea was: can we have the parent do it at home over time?

That removes the constraints of a clinical setting. The baby doesn't have to travel and can be home with a mom or dad or caregiver.

Deep: Yeah, you don't have to coordinate schedules. You don't need the extra time, COVID problems, all that stuff.

Bharath: Correct. So we split this assessment into administration and evaluation and made them asynchronous; they don't have to take place at the same time. And for administration, our key idea is: can we just have the parent do it? The first part of the AI is for the app, running on a tablet, laptop, or phone, to recognize that the baby is, for instance, on its back.

There are a set of cues. You can give the parent voice guidance, where the app tells the parent: hey, it looks like Layla's lying down on her back, can you rattle something on her left and see if she reaches for it? And then can you rattle something on her right and see if she reaches for it?

That's an example of figuring out in near real time what the baby's posture is, and giving specific ideas to the parent so that they can elicit some movement, right?

Deep: There are probably a number of things that go into deciding what exercise you provide the parent: the age of the child, what exercises you already have data for, and which ones you don't...

Bharath: These are standardized observational tools that clinicians use; we're just facilitating conducting them at home, which hasn't been done. Once you do that, you give the parent about a week. They don't have to make time and sit down with the baby for one hour; that is part of the constraint you have in the clinic, it's time-limited.

So you observe the baby longitudinally, a little longer. That is the first part of the AI: to look at a video at home and tell whether the baby is in one of these canonical poses, if you will. Is the baby on its tummy, on its back, sitting, standing, and so on? Based on that, you give very context-specific cues to the parent to elicit movement from the baby.
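
As a rough illustration of this first AI stage, here's a toy version of a canonical-pose check driving a parent cue. In reality this would be a trained classifier over the infant keypoints; the hand-written rules, labels, and cue text below are purely our assumptions.

```python
# Toy sketch: canonical pose -> context-specific parent cue.
# A trained model would replace the hand-written rules.
def canonical_pose(torso_angle_deg: float, face_visible: bool) -> str:
    """Crude stand-in: torso angle vs. the ground plane plus face visibility."""
    if torso_angle_deg > 60:
        return "standing"
    if torso_angle_deg > 30:
        return "sitting"
    return "supine" if face_visible else "prone"

CUES = {
    "supine":   "Looks like she's on her back. Try rattling a toy on her left, then her right.",
    "prone":    "Tummy time! Hold a toy just out of reach in front of her.",
    "sitting":  "She's sitting. Offer a toy slightly to one side to encourage reaching.",
    "standing": "She's bearing weight on her legs. Support her and see how long she holds it.",
}

pose = canonical_pose(torso_angle_deg=10.0, face_visible=True)
print(pose, "->", CUES[pose])   # supine -> "Looks like she's on her back..."
```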

The second part we did using a lot of data. The tool for assessment is called the Alberta Infant Motor Scale, AIMS. It's got 58 specific motor items. Crawling, for instance, will be one item, and there are actually different kinds of crawling; crawling to sitting will be another item, and rolling from back to tummy will be another item. Those are examples of what constitutes those 58 items. If you give a video of this assessment to an expert, the expert knows how to recognize any one of these 58 items.

So if you want to automate this, you need a system that can classify or recognize any one of these items from video, and that is a herculean task at this stage of machine learning: you give it one of these arcane items out of the 58 and the machine says, oh yeah, it is item number 17.

We cannot do that. So what we have done is reduce the 58 to 15 items. Using just 15 items picked by the machine, you can still do really well; you don't need 58 items. Even within the items, we kind of mash together items that are hard to distinguish, even for an expert.

One expert will look at an item and say, I think it is 18; another expert would say, no, I think it's 20. So it could be two different kinds of rolling.

Deep: Two different kinds? I was going to ask you for an example there.

Bharath: Is the baby leading with the hip, or with the shoulder?

I have looked at these for several years and I can't tell, with good reason: I'm not a developmental expert. But even the experts differ on whether the baby is leading with the hip or the shoulder. So what we did is combine those items. For instance, we just have one kind of rolling over.

We don't try to distinguish between the various kinds of rolling over, the idea being that it becomes a little easier for the machine to chronologically see: the baby's supine; hey, now the baby is prone; very likely the baby rolled over, and it's a supine-to-prone rollover.

The machine doesn't know anything else about how exactly the baby rolled over, but it becomes a little easier to train a system that just says: I found a rollover, I don't know what kind of rollover it was. So we use machine learning to reduce the number of items we have to recognize to arrive at a developmental score for the infant.
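
A sketch of that chronological inference might look like the following: collapse noisy per-frame pose labels into stable runs, then report any supine/prone flip as a generic rollover, with no attempt to identify the fine-grained variant. The smoothing threshold and labels are our assumptions.

```python
# Sketch: inferring generic "rollover" events from per-frame pose labels.
def detect_rollovers(pose_per_frame, min_stable_frames=15):
    """Collapse noisy per-frame labels into stable runs, then report
    supine->prone or prone->supine transitions as rollover events."""
    runs, events = [], []
    for label in pose_per_frame:
        if runs and runs[-1][0] == label:
            runs[-1][1] += 1                      # extend the current run
        else:
            runs.append([label, 1])               # start a new run
    stable = [lbl for lbl, n in runs if n >= min_stable_frames]
    for prev, cur in zip(stable, stable[1:]):
        if {prev, cur} == {"supine", "prone"}:
            events.append(f"rollover: {prev} -> {cur}")
    return events

seq = ["supine"] * 40 + ["sitting"] * 3 + ["prone"] * 50   # 3-frame blip ignored
print(detect_rollovers(seq))   # ['rollover: supine -> prone']
```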

And then the final part of using the AI: even now, we're not thinking full automation, where you pump in the video and out comes the score. We are not there yet; that is the goal.

Deep: What exactly is the score, and what's the connection between the score and some of the conditions you might have, like autism or cerebral palsy, et cetera?

Bharath: Yeah, so there is no cutoff. A motor score is one way of objectively measuring the developmental trajectory of an infant as it pertains to motor development. It's derived from a standardized tool that has been norm-referenced on 1,200 infants or so. You get a raw score, and based on the age of the baby, they'll fit somewhere...

Deep: ...on a Gaussian distribution.

Bharath: Yeah. You get a percentile score, and clinicians typically have a threshold that minimizes the false positive and false negative rates.

For instance, they could use the 10th percentile as a cutoff and say: if you're below the 10th percentile, we need additional evaluation of the baby, maybe a full battery of tests covering not just motor but speech, communication, behavior, et cetera. That takes more resources and more time commitment, but it becomes a trigger for triaging further care. So being below the 10th percentile doesn't automatically mean that...

Deep: ...you have the condition.

Bharath: Right. It is something that warrants further evaluation and further devotion of resources to evaluating the child.
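
For the curious, the scoring step Bharath outlines reduces to a lookup plus a cutoff: raw score in, age-referenced percentile out, with anything under the clinician's threshold flagged for follow-up. The norm table below is fabricated purely for illustration; real AIMS percentiles come from the published norm-referenced data.

```python
# Sketch: raw motor score -> age-referenced percentile -> referral cutoff.
from bisect import bisect_right

# Hypothetical norms for one age band: percentile -> raw-score threshold.
NORMS_6_MONTHS = {5: 18, 10: 21, 25: 25, 50: 30, 75: 35, 90: 40}

def percentile_rank(raw_score, norms):
    """Largest tabulated percentile whose raw-score threshold is met."""
    cuts = sorted(norms.items())                 # [(percentile, score), ...]
    scores = [s for _, s in cuts]
    idx = bisect_right(scores, raw_score) - 1
    return cuts[idx][0] if idx >= 0 else 0       # below lowest band -> <5th

raw = 20
pct = percentile_rank(raw, NORMS_6_MONTHS)
print(f"~{pct}th percentile" + ("  -> refer for full evaluation" if pct < 10 else ""))
```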

Deep: I'm curious about the flip side. Let's say you're in the, I don't know, 18th percentile, but you're a concerned mom and you want to intervene anyway. Are there benefits to getting infants more mobile earlier, regardless of these more serious conditions?

Bharath: I don't know whether it's controversial to suggest that you need to do that early enough, and that there is just one pathway or trajectory of development that's acceptable. I think it depends on the concern level of the parent, right? At least in Washington State, parents can self-refer. You still have to get in the queue, and there are clinics, like Kindering in Bellevue, that will take you and put you through the full battery of tests.

I think one of the benefits of making this convenient, doable at home, and low cost is that you can have longitudinal observation of the child. Just because you're at the 10th percentile one month doesn't mean that is your trajectory. You need a few data points to see whether your concern is validated, and you don't have to wait for the next clinic visit to get additional data about your child. So that is the benefit of offering a tool where, depending on the level of concern or how proactive the parent is, they can go through this and figure more out based on the data. The final point I want to make is about the alternative to full automation; that's where we're talking about the motor score and the percentile and the cutoffs and all that.

The final part of the AI is allowing a clinician or developmental expert to look at these videos rapidly and arrive at an evaluation report, if you will. Currently, if a 30-minute video takes 45 minutes to evaluate, our job is to iteratively reduce the time it takes to evaluate the video. The AI can run through the video in advance and cue up the interesting or salient parts.

Deep: Yeah, and then kind of solicit guesses. I imagine you can just make the developmental experts more efficient.

Bharath: Absolutely. Correct.

Deep: Because in this kind of passive observation environment, you could have hours of time where the baby's just asleep, or maybe they did 10 of this particular type of activity and you're really interested in the minority activity. Even if the machine is wrong a few times, it's still going to save the developmental expert a fair amount of time, in theory.

Bharath: And sometimes you may miss something. For instance, a week ago, when a baby's video was scored, looking at the breakdown of the scores, the clinician wondered: did the baby spend any time at all standing? It doesn't mean the baby has to be independently standing; the parent or the clinician may be assisting the baby and seeing whether the baby can bear weight on its legs.

They saw that there was no standing subcomponent score, and that is where the machine could say: hey, here are the standing parts of the video. You don't have to look through a 30-minute video to find where the baby potentially could have been in a standing pose. That becomes a way for the machine to reduce the video you have to look at to get a comprehensive evaluation.

Deep: I'm also just guessing that, due to the limitation of time, the developmental expert might not get an opportunity to view some baby poses in particular scenarios. I don't know, maybe the baby just refused to do something. Is that also one of the benefits of having this kind of long-term, passive, asynchronous monitoring?

Bharath: The longer you monitor the baby, the better, but that also makes it impossible for a human to go through. You have increased the complexity and the requirements on the clinician's part: instead of looking at a video that's 15 minutes long, it's now three hours of video gathered over one week. It's simply impossible for a human to look at all of that.

That's where AI comes in: to eliminate the non-salient parts so you just look at the salient parts. And if you find that the baby has not spent any time in the sitting pose, for instance, now you have an opportunity to send a notification via the app to the parent saying: hey, we got all the information that we need, we just need a little bit more of the baby spending time in the sitting pose. So it becomes a very targeted request to the parent to complete the task, so you can get all the information you need to finish your evaluation.
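
Conceptually, the targeted-request loop Bharath describes could be as simple as tallying observed time per canonical pose across the week's footage and asking the parent to fill whatever gap remains. A minimal sketch, with made-up pose labels, frame rate, and thresholds:

```python
# Sketch: per-pose coverage across a week's clips -> targeted parent request.
from collections import Counter

REQUIRED_POSES = ["supine", "prone", "sitting", "standing"]

def coverage_report(pose_per_frame, fps=30):
    """Seconds of observed time per required canonical pose."""
    frames = Counter(pose_per_frame)
    return {p: frames.get(p, 0) / fps for p in REQUIRED_POSES}

def targeted_requests(report, min_seconds=60):
    """One app notification per pose with insufficient coverage."""
    return [f"Please capture about one more minute of {pose} time."
            for pose, secs in report.items() if secs < min_seconds]

week = ["supine"] * 90000 + ["prone"] * 5400 + ["sitting"] * 2700  # no standing
for msg in targeted_requests(coverage_report(week)):
    print(msg)   # -> "Please capture about one more minute of standing time."
```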

Deep: What's been your biggest challenge? Maybe there's a technical one and a non-technical one, but what's been the thing that sort of surprised you? It's hard working with video, especially when you have very long time horizons, just the sheer compute power required to train up these models. And you're not even talking about frame-level-only insights; you're talking about activity analysis, which is very much at the forefront of machine learning these days. What's been the hardest on the technical problem front? And then maybe the same question on the non-technical front.

Bharath: I think what we're doing is at the intersection of technology like AI and the behavioral aspects, what you would call a behavioral tool used to assess infant development. We're finding that the instruments themselves are quite complicated, and it seemed like a daunting task when we started this project: do we have a product if we don't automate extraction of these nuanced, complicated activities from videos of persons that look like a blob? It seemed like an impossible task, and our solution to those challenges has been to pare down the tool itself used to assess the infant's development. That took a lot of time; the generation of data has been a long process.

As you can imagine, we have more data for the pose estimation, because we have a million-plus frames of data, each with 10 key points, body joints. So we have a system that can detect the baby's pose and the body joints with a great degree of accuracy.

What we don't have is that many samples for activities. Even when you look at public open-source databases of human activities, they're not as numerous as samples for image classification, right? So getting this activity database has been challenging, but I think we are in a good place, with several thousands of samples of these motor activities.

But then when you look at a specific activity, you can ask: how many samples do we have for activity number 24 out of these 58? For some activities, we may see just 50 samples, because infants don't really do some of these activities, or you're not lucky enough to capture them often enough on camera.

So that is the challenging part: generation of these samples requires expert time, and that is expensive. Our innovation has been to not rely on complete soup-to-nuts automation, but on what we call augmented AI, where you have an expert in the loop and you're just trying to make the utilization of that expert's time efficient, so their throughput can go up. And then we also innovate in terms of making things asynchronous, so that if a pediatric occupational therapist has an hour on a Saturday afternoon, they may be able to look at the queue of videos that have been processed by AI and whip through three infant assessments in a 30-minute time slot, right? We can bring a lot of efficiencies there and build something that is immediately usable in 2021, as opposed to waiting for complete automation in 2025.
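
The "augmented AI" workflow can be pictured as a review queue where each video carries its AI-flagged salient spans, so the expert's review time scales with the flagged minutes rather than the raw footage. A hypothetical sketch; the 1.5x review factor and field names are our assumptions:

```python
# Sketch: an expert-in-the-loop review queue over AI-preprocessed videos.
from dataclasses import dataclass

@dataclass
class ReviewItem:
    video_id: str
    total_minutes: float            # raw footage gathered over the week
    salient_minutes: float          # AI-flagged portion the expert must watch

    @property
    def est_review_minutes(self):   # assume review takes ~1.5x flagged playback
        return 1.5 * self.salient_minutes

queue = [
    ReviewItem("infant-031", 180.0, 6.0),
    ReviewItem("infant-044", 150.0, 4.0),
    ReviewItem("infant-052", 200.0, 9.0),
]
# Shortest reviews first, so a spare half hour clears the most assessments.
for item in sorted(queue, key=lambda it: it.est_review_minutes):
    print(item.video_id, f"~{item.est_review_minutes:.0f} min review")
```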

Deep: So Bharath, let's fast-forward five years, maybe even 10 years out, and talk to that new mother who's got her first child and is nervous about her child's development. What is your system going to give that mother, and why is her life better off using your technology?

Bharath: It's a brilliant question. To be honest, we never really thought of this when we started the project. Then we went through this NIH program called Innovation Corps, a bootcamp for entrepreneurs, where you talk to 100-plus stakeholders, many of them people with pain points that you're trying to relieve through your product, right?

It's not always the clinicians; mostly it's the parent. Our vision is to have passive observation of the infant at home, blending into the hectic lifestyle of a parent with a newborn, and to give them frequent information about the developmental trajectory of their child, but more importantly, to give them something to do.

There are specific activities parents can do to engage with a child even when they are less than three months old, in terms of verbal communication, social interaction, and motor development, a lot of simple activities that parents know instinctively. Many of these exercises or activities are things that parents may be doing already. Our vision is to observe the baby, and based on the activities, the milestones achieved, and the emerging skills of that specific infant, have targeted activity plans that the parent can participate in.

We started developing this idea, and early versions of it are on our website, earlymarkers.com. It's called Motor Minutes: real short, bite-sized videos, like 30 seconds or a minute, that tell you, okay, your baby is six months old, and these are the skills you should already be seeing.

These are potentially emerging skills at this time, and here are activities you could do as a parent to propel the baby along those trajectories or to augment the baby's development if milestones are delayed or being missed, or certain things are not happening.

A concrete example would be the amount of tummy time the baby spends on a per-day, per-week basis, which is highly correlated with future development. Tummy time is a hard thing for a lot of babies, and hard for a parent to encourage because it's stressful if the baby's crying. But there are some concrete steps that parents can take. We're developing these occupational therapy modules, or play activities, what we call Motor Minutes, that target the parent and allow them to do these specific activities, and our system can observe whether the baby's prone time has gone up.

So those are tangible measures. You help the parent interact with the baby, augment the baby's development, and give the parent feedback.

Deep: So to the mom, you're basically saying: look, in five to 10 years, in your nursery, you're going to have a developmental expert. The clinicians are going to be in there monitoring your baby and making sure you do just the right thing to optimize their perfect health trajectory, or something like that, right?

Bharath: And we, of course, rightfully focus on the video signals because that's our domain. But there are also audio methodologies for analyzing the baby's vocalizations and the conversational turn-taking between the infant and the parent. There are tools that can be used to predict the risk for conditions like autism based on vocalization analysis. You don't have to be restricted to just visual analysis; if you have a home nursery solution, it can look at various signal modalities to give feedback to the parent and the clinician.

Deep: Awesome. It's been fantastic having you on, talking to us about Early Markers and some of the amazing work you're doing to make sure kids grow up healthy. Thanks a ton for coming on.

Bharath: Thank you. It was awesome and exciting.

Deep: That's all for today. Bharath, thanks so much for chatting about the great work you guys are doing. And to our audience, thanks so much for tuning in to Your AI Injection. If you're interested in reading more about the enabling technology behind Early Markers, you can check out our website at xyonix.com, that's X-Y-O-N-I-X dot com, at xyonix.com/solutions/body-pose-analysis. And if you like what you're hearing, we're a relatively new podcast and would really appreciate you leaving a review on the podcast platform you're listening to us on. Thanks so much.