
beh pls error
cesopenko
Posted on 06/04/14 15:50:58
Number of posts: 6
cesopenko posts:

I am running a behavior PLS from the command line with datamat1 (group 1 data), datamat2 (group 2 data), and datamat3 (genetic data for both groups; 3 columns of binary data). The PLS seems to run, but I get two errors that I'm not sure what to do with. The first occurs before the bootstrapping and says: "For at least one behavior measure, the minimum unique values of resampled behavior does not exceed 50% of its total." The second occurs on bootstraps 484-500 and says: "Please check behavior data".

Can anyone give me guidance on these errors?

Many thanks!

Replies:

Untitled Post

nlobaugh
Posted on 06/04/14 18:02:36
Number of posts: 229
nlobaugh replies:

These are checks we have put into the behavPLS module to catch behavioural data with minimal variance.

This most often happens with neuropsych data, where the range of scores is quite restricted, and it is likely to occur with genetic data as well, where you have mostly 1's and 0's to indicate presence/absence of the allele.

If there is not sufficient variability, the correlation cannot be computed. We may have an alternate version that can better handle genetic data. I need to see if it is available, and will get back to you.
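To make the zero-variance problem concrete, here is a minimal Python sketch (not PLS code; the function name and the toy numbers are made up for illustration). The Pearson correlation divides by the spread of each variable, so a resampled allele column that is all 1's or all 0's leaves nothing to divide by.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns NaN when either input has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((a - mean_x) ** 2 for a in x)  # sum of squares of x
    ss_y = sum((b - mean_y) ** 2 for b in y)  # sum of squares of y
    if ss_x == 0 or ss_y == 0:
        # The denominator sqrt(ss_x * ss_y) would be 0: correlation is undefined.
        return math.nan
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: one brain value per subject and one binary allele column.
brain = [0.2, 1.1, 0.7, 1.5, 0.9]
allele_varied = [0, 1, 0, 1, 1]
allele_constant = [1, 1, 1, 1, 1]  # what an unlucky resample can look like

print(pearson_r(brain, allele_varied))
print(pearson_r(brain, allele_constant))  # NaN: no variance to correlate with
```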

n

 



Untitled Post
cesopenko
Posted on 06/04/14 18:16:57
Number of posts: 6
cesopenko replies:

Great! Thank you very much! So even though the pls finished and I have a significant LV, I shouldn't trust it?



Untitled Post

nlobaugh
Posted on 06/12/14 14:42:57
Number of posts: 229
nlobaugh replies:

Hi Carrie..

to help you figure out what is going on, take a look in the boot_results structure array.

You will find the following variables:

num_LowVariability_behav_boots: the number of bootstrap samples in which the resampled behavior data had low variability
badbeh: the bad behav data produced by a bad re-order (0 standard deviation), which would go on to cause a division by 0
countnewtotal: the count of new samples that were re-ordered because of badbeh

Normally, you should have an empty ("[]") matrix for badbeh and a 0 for countnewtotal. Your error indicates that zero-variance bootstrap samples were generated for all bootstrap samples for at least one gene. The program tried to generate a new bootstrap sample to accommodate. The number of attempts to get a non-zero-variance sample is capped, somewhat arbitrarily, at the number of bootstrap samples. In most cases the program won't have to resample that many times, but it is still good to look at countnewtotal if you get warnings. If countnewtotal is high relative to the number of bootstraps, you don't really have enough variance in the data, and you should drop that gene from the analysis.
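The redraw behaviour described above can be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not the actual PLS implementation: the function name, the seed argument, and the return values are hypothetical, with redraw_count and failed only loosely mirroring countnewtotal and badbeh.

```python
import random

def bootstrap_with_redraw(values, n_boot, max_redraws, seed=0):
    """Resample `values` with replacement, redrawing zero-variance samples.

    Returns (samples, redraw_count, failed), where `failed` lists the
    bootstrap indices that were still constant after `max_redraws` tries.
    """
    rng = random.Random(seed)
    n = len(values)
    samples = []
    redraw_count = 0  # loosely analogous to countnewtotal
    failed = []       # loosely analogous to badbeh
    for b in range(n_boot):
        sample = [values[rng.randrange(n)] for _ in range(n)]
        tries = 0
        # A sample with fewer than 2 distinct values has zero variance.
        while len(set(sample)) < 2 and tries < max_redraws:
            sample = [values[rng.randrange(n)] for _ in range(n)]
            tries += 1
            redraw_count += 1
        if len(set(sample)) < 2:
            failed.append(b)
        samples.append(sample)
    return samples, redraw_count, failed

# Hypothetical gene with a single carrier among 8 subjects.
rare_gene = [1, 0, 0, 0, 0, 0, 0, 0]
_, redraws, failed = bootstrap_with_redraw(rare_gene, n_boot=500, max_redraws=500)
print("redraws:", redraws, "bootstraps that never recovered:", len(failed))
```

A redraw count that is large compared with the number of bootstraps is the signal described above that the measure does not carry enough variance.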

For genetic data, you will likely see that num_LowVariability_behav_boots is equal to the number of bootstraps for each gene. This value is incremented if the zero-variance error occurs, but it is also updated by a quick check of whether the values on the "behav" side are at least 50% unique, which can be useful for ordinal data with a small range. With genetic data, given that some genetic variants can be rare, this warning will most likely always be generated.
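A rough sketch of why binary data always trips the 50% check is shown below. It assumes the check compares the number of distinct values in each resample against 50% of the number of observations; that reading of "50% of its total" is an assumption, and the function name is hypothetical rather than taken from the PLS code.

```python
import random

def low_variability_count(values, n_boot, threshold=0.5, seed=0):
    """Count bootstrap samples whose distinct values are <= threshold * n."""
    rng = random.Random(seed)
    n = len(values)
    count = 0
    for _ in range(n_boot):
        sample = [values[rng.randrange(n)] for _ in range(n)]
        # Assumed rule: flag the sample when unique values do not exceed 50% of n.
        if len(set(sample)) <= threshold * n:
            count += 1
    return count

# Hypothetical binary gene in 20 subjects: at most 2 distinct values out of 20.
binary_gene = [0, 1] * 10
print(low_variability_count(binary_gene, n_boot=500))  # flagged on every bootstrap

# Hypothetical score with 20 distinct values for comparison.
continuous_score = list(range(20))
print(low_variability_count(continuous_score, n_boot=500))
```

Under this assumed rule a binary column is flagged on every bootstrap regardless of the allele split. Note that even a measure with all-distinct scores keeps only about 63% unique values on average after resampling with replacement, so a restricted-range neuropsych score can drop below 50% fairly easily.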

I would suggest taking a look at the distributions of your gene variants to make sure you have reasonable representation of the variants. There is no rule, but in the papers we've published so far, we were successful with at least a 30/70 split; other genes had a 40/60 or 50/50 split across the sample.
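One way to look at the variant distributions is sketched below in plain Python. The helper names and the toy genotype vector are hypothetical; the 0.30 cutoff simply restates the 30/70 split mentioned above and is not a hard rule. The second helper gives the chance that a single resample of n subjects comes out all carriers or all non-carriers, which is p^n + (1 - p)^n for carrier proportion p.

```python
def minor_fraction(genotypes):
    """Proportion of the less common category in a 0/1 genotype vector."""
    carriers = sum(1 for g in genotypes if g == 1)
    frac = carriers / len(genotypes)
    return min(frac, 1 - frac)

def prob_constant_resample(genotypes):
    """Chance that one resample with replacement is all 1's or all 0's."""
    n = len(genotypes)
    p = sum(1 for g in genotypes if g == 1) / n
    return p ** n + (1 - p) ** n

# Hypothetical gene column: 2 carriers among 12 subjects.
gene = [1, 1] + [0] * 10
split = minor_fraction(gene)
print("minor fraction:", round(split, 3))
print("below the 30/70 split:", split < 0.30)
print("P(zero-variance resample):", round(prob_constant_resample(gene), 4))
```

If the bootstrap resamples subjects within each group, as is usual for a multi-group analysis, it is probably worth running this per group rather than on the pooled sample.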

cheers,

nancy

 



Untitled Post
cesopenko
Posted on 06/13/14 15:35:19
Number of posts: 6
cesopenko replies:

Thank you! We've decided to run the genetic data as a task PLS. However, I'm having this error for my neuropsych data too. For the most part I have 0's in the low variability matrix, but for a few measures I have numbers anywhere from 3 to 500. The PLS is still able to complete all permutations and bootstraps, but I'm getting that "less than 50%" variability error. Does this mean I should be getting rid of some of my neuropsych tests?

Thanks!

Carrie



Untitled Post
cesopenko
Posted on 06/13/14 15:37:58
Number of posts: 6
cesopenko replies:

Or is this expected, given that some of my neuropsych variables have a low range?

Thanks!! 



Untitled Post

nlobaugh
Posted on 06/13/14 16:01:02
Number of posts: 229
nlobaugh replies:

Carrie, those are "informational warnings" only. It is up to you to decide whether the distribution of neuropsych scores meets the requirements for correlation analysis. We chose a conservative 50% unique rule. If the bootstrap and original distributions are biased/skewed, you will be less likely to detect meaningful effects.

cheers

Nancy



