RNA-Seq analysis of non-small cell lung cancer in female never-smokers reveals candidate cancer-associated long non-coding RNAs
Running title: Identification of lncRNAs related to NSCLC
- Immune response may be an important mechanism involved in NSCLC progression.
- LAT, LIME1, SLA2 and DEFB4A may be involved in NSCLC via immune response.
- Lnc-GGPS1, lnc-ZNF793 and lnc-STK4 may play roles in promoting NSCLC progression.
- Down-regulated lnc-LOC284440, lnc-PPIEL, and lnc-ZNF461 may play roles in NSCLC.
Purpose: We aimed to elucidate the potential mechanisms of long non-coding RNAs (lncRNAs) in the progression of non-small cell lung cancer (NSCLC).
Methods: The RNA-seq dataset GSE37764, including 3 primary NSCLC tumors and 3 matched normal tissues isolated from 6 Korean female never-smokers, was downloaded from the Gene Expression Omnibus database. The differentially expressed lncRNAs and mRNAs in NSCLC samples were identified using the NOISeq package. A co-expression network of differentially expressed lncRNAs and mRNAs was established. Gene Ontology (GO) and pathway enrichment analyses were then performed. Finally, lncRNAs related to NSCLC were predicted by blasting the differentially expressed lncRNAs against all predicted lncRNAs related to NSCLC.
Results: A total of 182 differentially expressed lncRNAs (109 up- and 73 down-regulated) and 539 differentially expressed mRNAs (307 up- and 232 down-regulated) were identified. Among them, 4 up-regulated lncRNAs, such as lnc-geranylgeranyl diphosphate synthase 1 (GGPS1), lnc-zinc finger protein 793 (ZNF793) and lnc-serine/threonine kinase 4 (STK4), and 4 down-regulated lncRNAs, including lnc-LOC284440, lnc-peptidylprolyl isomerase E-like pseudogene (PPIEL) and lnc-zinc finger protein 461 (ZNF461), were predicted to be related to NSCLC. Lnc-GGPS1, lnc-ZNF793 and lnc-STK4 were co-expressed with linker for activation of T cells (LAT) and Lck interacting transmembrane adaptor 1 (LIME1). Lnc-LOC284440, lnc-PPIEL and lnc-ZNF461 were co-expressed with Src-like-adaptor 2 (SLA2) and defensin beta 4A (DEFB4A).
Conclusions: Immune response may be an important mechanism involved in NSCLC progression. Lnc-GGPS1, lnc-ZNF793, lnc-STK4, lnc-LOC284440, lnc-PPIEL, and lnc-ZNF461 may be involved in the immune response and promote NSCLC progression via co-expression with LAT, LIME1, SLA2 and DEFB4A.
Keywords: non-small cell lung cancer (NSCLC); long non-coding RNAs (lncRNAs); co-expression network; Gene Ontology (GO); pathway enrichment analysis
Introduction
Lung cancer is the leading cause of cancer-related mortality worldwide [1], and non-small cell lung cancer (NSCLC) accounts for 80-85% of all lung cancers [2]. Smoking is the main cause of lung cancer; nevertheless, a notable prevalence of NSCLC in female never-smoker patients has been observed, particularly in Asian countries [2, 3]. These epidemiological data make non-smoking-associated lung cancer a distinct disease entity, in which specific genetic and molecular characteristics of tumors are being recognized [2]. Despite recent advances in NSCLC therapies, the high mortality of NSCLC patients has not significantly decreased over the years [4]. Therefore, exploring more effective and safe treatment strategies is urgent, and it is of great importance to elucidate the mechanisms involved in NSCLC at the molecular level.
Recently, long non-coding RNAs (lncRNAs) have emerged as drivers of tumor-suppressive and oncogenic functions in various prevalent cancers, such as lung cancer [5, 6]. LncRNAs are mRNA-like transcripts ranging in length from 200 nucleotides (nt) to 100 kilobases (kb) that lack significant open reading frames; hence, they do not function as templates for protein synthesis [7, 8]. In spite of this, accumulating epidemiological studies have suggested that misregulated lncRNA expression may be a major contributor to tumorigenesis across numerous cancer types [8, 9]. For example, the lncRNA metastasis associated lung adenocarcinoma transcript 1 (MALAT1) is thought to enhance the migration of NSCLC cells in vitro by influencing the expression of motility-related genes [10, 11]. The lncRNA HOTAIR is associated with short disease-free survival in human NSCLC, and forced expression of HOTAIR enhances lung cancer cell growth and migration [12]. Knockdown of H19 expression can impair lung cancer cell growth and clonogenicity in model systems in vitro [13]. Therefore, lncRNAs have been considered key regulators in various cancers and are increasingly becoming a new cancer diagnostic and therapeutic gold mine. However, the major lncRNAs related to NSCLC and their roles in the molecular pathogenesis of NSCLC remain unclear.
Significant advances in next-generation sequencing technologies have revolutionized omics and biomedical studies, especially in the field of cancer research [14]. Deep sequencing techniques provide a comprehensive understanding of cancer progression at the molecular level [14]. In a previous study, GSE37764 was used to explore DNA copy number variations in female never-smoker patients with NSCLC, dissecting the molecular nature of NSCLC through integration with an array comparative genomic hybridization (array-CGH) study [15]. In the present study, to elucidate the roles of lncRNAs in NSCLC progression, lncRNA profiling by high-throughput sequencing (RNA-seq) was used to screen the differentially expressed lncRNAs in female never-smokers with NSCLC. Then lncRNAs related to NSCLC and their co-expressed mRNAs were identified using comprehensive bioinformatics approaches. Our study will give new insights into the pathogenesis of NSCLC in female never-smokers.
Materials and methods
Sources of data
The sequencing data of GSE37764 [15], including 3 primary NSCLC tumors and 3 matched normal tissues isolated from 6 Korean female never-smokers, were downloaded from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/). The samples were sequenced on the Illumina Genome Analyzer IIx (Homo sapiens) platform. The sequencing strategy was paired-end, and the read length was 78 nt.
Raw read filtering
The raw reads were first converted into fastq format using the fastq-dump program in sratoolkit [16], and dirty raw reads were then removed prior to analyzing the data. Three criteria were used to filter out dirty raw reads: (1) remove reads with sequence adapters; (2) remove reads with more than 5% 'N' bases; (3) remove low-quality reads, in which more than 10% of the bases have QA ≤ 20. Finally, clean reads were obtained for all subsequent analyses.
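The three filtering criteria can be illustrated with a minimal Python sketch. The source does not name the filtering tool, so everything here is an assumption made for illustration: the `Read` type, the `is_clean` function, exact substring matching for adapters, the example adapter string, and reading the quality criterion as "Phred score at or below 20".

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Read:
    """Hypothetical container for one sequencing read."""
    sequence: str
    qualities: List[int]  # one Phred quality score per base


def is_clean(read: Read,
             adapters: Sequence[str],
             max_n_fraction: float = 0.05,
             max_low_quality_fraction: float = 0.10,
             quality_cutoff: int = 20) -> bool:
    """Return True if the read passes all three filtering criteria."""
    length = len(read.sequence)
    if length == 0 or length != len(read.qualities):
        raise ValueError("sequence and qualities must be non-empty and equal in length")

    # Criterion 1: discard reads containing an adapter
    # (exact substring match, a simplification for this sketch).
    if any(adapter in read.sequence for adapter in adapters):
        return False

    # Criterion 2: discard reads with more than 5% 'N' bases.
    if read.sequence.upper().count("N") / length > max_n_fraction:
        return False

    # Criterion 3: discard reads in which more than 10% of bases
    # have a quality score at or below the cutoff (assumed direction).
    low_quality = sum(1 for q in read.qualities if q <= quality_cutoff)
    if low_quality / length > max_low_quality_fraction:
        return False

    return True
```

For paired-end data such as this dataset, a pair would typically be kept only if both mates pass; that pairing logic is left out of the sketch.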
Sequence alignment and transcriptome assembly
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) is an online public tool providing access to a growing database of genomic sequences and annotations of various organisms for visualization, comparison and analysis [17]. TopHat and Cufflinks [18] are open-source software tools for gene discovery and comprehensive expression analysis of high-throughput RNA sequencing (RNA-seq) data.
Clean reads were aligned to the reference genome downloaded from the UCSC website (version hg19) using bowtie1 in TopHat. The runtime parameters for the alignment of each sample were set as follows: --read-mismatches = 2, --mate-inner-dist = 77; all other parameters were left at their defaults.
According to the reference transcript annotation information from the UCSC website (version hg19), the transcriptome of each sample was assembled by Cufflinks. The assembled results of each sample were then merged using cuffmerge in Cufflinks.
Prediction of lncRNAs
Step1, the assembled transcripts shorter than 200 nt were removed;
Step2, the protein-coding potential of each transcript from Step1 was assessed using CPC [19]. The transcripts that do not encode proteins but function as lncRNAs were then identified.
Step3, lncRNAs were also screened by blasting the transcripts from Step1 against human lncRNAs extracted from the NONCODEv4 [20] database using blastn in BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The parameters of the blastn program were set as follows: Expectation value (E) [Real] = 20; -m = 8.
Step4, the lncRNAs common to Step2 and Step3 were considered predicted lncRNAs.
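The way the four steps combine can be sketched in Python as a length filter followed by a set intersection. This is only an illustration: the function name `predict_lncrnas` and its inputs are hypothetical, and the CPC and blastn results are assumed to have already been reduced to sets of transcript IDs.

```python
from typing import Dict, Iterable, Set


def predict_lncrnas(transcript_lengths: Dict[str, int],
                    cpc_noncoding_ids: Iterable[str],
                    noncode_hit_ids: Iterable[str],
                    min_length: int = 200) -> Set[str]:
    """Combine Step1 to Step4 on precomputed inputs.

    transcript_lengths: transcript ID -> length in nt (assembled transcripts)
    cpc_noncoding_ids:  IDs judged non-coding by CPC (Step2, assumed precomputed)
    noncode_hit_ids:    IDs with a blastn hit in NONCODEv4 (Step3, assumed precomputed)
    """
    # Step1: remove transcripts shorter than 200 nt.
    long_enough = {tid for tid, length in transcript_lengths.items()
                   if length >= min_length}

    # Step2 and Step3 are both applied to the Step1 survivors.
    noncoding = long_enough & set(cpc_noncoding_ids)
    noncode_hits = long_enough & set(noncode_hit_ids)

    # Step4: keep only transcripts supported by both lines of evidence.
    return noncoding & noncode_hits
```

Requiring agreement between the coding-potential filter and the database match is conservative: a transcript supported by only one of the two is dropped.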
Identification of differentially expressed lncRNAs and mRNAs
The NOISeq [21] package was used to identify differentially expressed lncRNAs and mRNAs in primary NSCLC tumors compared with normal controls. q = 0.99 was used as the cut-off value for screening.
Co-expression network construction
The absolute value of the Pearson correlation coefficient was used as the co-expression similarity measure. Cytoscape is open software for visualizing complex networks and integrating them with any type of attribute data. The differentially co-expressed lncRNA-mRNA pairs with a Pearson correlation coefficient > 0.85 were therefore screened, and the co-expression network of these pairs was then established using Cytoscape.
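The pair-screening step can be sketched in plain Python as follows. The function names and the dictionary layout of the expression data are assumptions for illustration; only the similarity measure (absolute Pearson correlation) and the 0.85 threshold come from the text.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_x * var_y)


def coexpressed_pairs(lncrna_expr: Dict[str, Sequence[float]],
                      mrna_expr: Dict[str, Sequence[float]],
                      threshold: float = 0.85) -> List[Tuple[str, str, float]]:
    """Return (lncRNA, mRNA, r) for every pair with |r| above the threshold."""
    edges = []
    for lnc_id, lnc_values in lncrna_expr.items():
        for mrna_id, mrna_values in mrna_expr.items():
            r = pearson(lnc_values, mrna_values)
            if abs(r) > threshold:
                edges.append((lnc_id, mrna_id, r))
    return edges
```

The resulting edge list could be written out as a tab-delimited table for import into Cytoscape. With only six samples, as in this dataset, high correlations can arise by chance, so such edges are best read as candidates.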
Gene Ontology (GO) and pathway enrichment analysis
The GO database is a collection of gene annotation terms for large-scale genomic or transcriptomic data. The Database for Annotation, Visualization and Integrated Discovery (DAVID) [22] is an online tool used for systematically associating functional terms with large gene or protein lists. We performed GO-BP (biological process) enrichment analysis of the mRNAs differentially co-expressed with lncRNAs using the DAVID online tool. A p-value < 0.05 was defined as the cut-off value.
The Web-based Gene Set Analysis Toolkit (WebGestalt) is a popular web-based tool for efficient functional enrichment analysis of gene lists derived from large-scale genomic, transcriptomic, and proteomic studies. In this paper, pathway enrichment analysis of the mRNAs differentially co-expressed with lncRNAs was further performed with WebGestalt. rawP < 0.01 was set as the threshold value.
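The text does not state which statistical test the enrichment tools applied. As a hedged illustration of how an over-representation p-value is commonly computed, the Python sketch below uses a one-sided hypergeometric test; the function name and argument names are assumptions, and the result is not claimed to reproduce the exact scores reported by DAVID or WebGestalt.

```python
from math import comb


def hypergeom_enrichment_p(overlap: int,
                           list_size: int,
                           term_size: int,
                           background_size: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    overlap:         genes in the input list annotated to the term
    list_size:       number of genes in the input list
    term_size:       number of background genes annotated to the term
    background_size: total number of background genes
    """
    if not (0 <= overlap <= min(list_size, term_size)):
        raise ValueError("overlap is inconsistent with list_size and term_size")
    if list_size > background_size or term_size > background_size:
        raise ValueError("list_size and term_size cannot exceed background_size")

    total = comb(background_size, list_size)
    upper = min(list_size, term_size)
    # Sum the probability of observing `overlap` or more annotated genes.
    favourable = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total
```

A term would then be reported when its p-value falls below the chosen cut-off, for example 0.05 for the GO-BP analysis or 0.01 for the pathway analysis described above.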
Prediction of lncRNAs related to NSCLC
First, the differentially expressed lncRNAs were screened again based on the hg19 reference genome information from the UCSC website.
Second, lncRNAs related to NSCLC were extracted from LncRNADisease. LncRNADisease [23] is a publicly accessible database of lncRNA-disease associations, which collects and curates about 480 experimentally validated lncRNA-disease association entries covering 166 diseases.
Third, lncRNAs were obtained by blasting the above differentially expressed lncRNAs against the lncRNAs related to NSCLC using blastn in BLAST. The parameters of the blastn program were as follows: Expectation value (E) [Real] = 10; -m = 8. lncRNAs were then removed based on the following criteria: the lncRNA was shorter than 200 nt and the BLAST similarity was less than 90.
Finally, the remaining lncRNAs were considered potential lncRNAs related to NSCLC.
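The post-BLAST filtering can be sketched in Python as below. The `-m 8` option of legacy BLAST produces a 12-column tab-delimited table (query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score). Several points are assumptions made for this sketch: "similarity" is read as the percent identity column, an lncRNA is kept only if it is at least 200 nt long and has a hit with identity of at least 90, and the function names and IDs are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def parse_blast_tabular(lines: Iterable[str]) -> List[dict]:
    """Parse BLAST tabular (-m 8) output into a list of hit records."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        hits.append({
            "query": fields[0],
            "subject": fields[1],
            "identity": float(fields[2]),  # percent identity
            "evalue": float(fields[10]),
        })
    return hits


def nsclc_related_lncrnas(hits: List[dict],
                          query_lengths: Dict[str, int],
                          min_length: int = 200,
                          min_identity: float = 90.0) -> Set[str]:
    """Keep query lncRNAs that are long enough and have a high-identity hit."""
    kept = set()
    for hit in hits:
        length = query_lengths.get(hit["query"], 0)
        if length >= min_length and hit["identity"] >= min_identity:
            kept.add(hit["query"])
    return kept
```

Percent identity alone does not account for how much of the query is covered by the alignment, so a stricter variant might also require a minimum alignment length.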
Results
Prediction of lncRNAs
In our study, a total of 1282 predicted lncRNA sequences were obtained. The minimum sequence length was 201 nt, the maximum was 9848 nt, and the mean was 843.81 nt.
Identification of differentially expressed lncRNAs and mRNAs
Using the NOISeq package with q = 0.99 as the threshold, we finally obtained 182 differentially expressed lncRNAs, including 109 up-regulated and 73 down-regulated ones. In addition, a total of 539 differentially expressed mRNAs (307 up- and 232 down-regulated) were identified.
Prediction of lncRNAs related to NSCLC
By blasting the differentially expressed lncRNAs against all predicted lncRNAs related to NSCLC, we screened 8 differentially expressed lncRNAs related to NSCLC. Among them, lnc-geranylgeranyl diphosphate synthase 1 (GGPS1), lnc-zinc finger protein 793 (ZNF793), lnc-serine/threonine kinase 4 (STK4), and lnc-interferon regulatory factor 1 (IRF1) were up-regulated, while lnc-myelin expression factor 2 (MYEF2), lnc-LOC284440, lnc-peptidylprolyl isomerase E-like pseudogene (PPIEL), and lnc-zinc finger protein 461 (ZNF461) were down-regulated.
Co-expression network analysis
As shown in Figure 1, a co-expression network of differentially expressed lncRNAs and mRNAs was established. The results showed that lnc-GGPS1, lnc-ZNF793 and lnc-STK4 were co-expressed with linker for activation of T cells (LAT) and Lck interacting transmembrane adaptor 1 (LIME1); lnc-IRF1 was co-expressed with interferon-inducible guanylate binding protein 1 (GBP1) and GBP2; lnc-MYEF2 was co-expressed with interleukin 20 (IL20) and GLI family zinc finger 4 (GLI4); and lnc-LOC284440, lnc-PPIEL and lnc-ZNF461 were co-expressed with Src-like-adaptor 2 (SLA2) and defensin beta 4A (DEFB4A).
GO and pathway enrichment analysis
We performed GO-BP and pathway enrichment analyses for the differentially expressed mRNAs. The overrepresented GO-BP terms were mainly associated with immune response, placenta development and synapsis (Table 1). The significantly enriched pathways included Interferon Signaling, LPA receptor mediated events, and Cytokine Signaling in Immune system (Table 2). Notably, mRNAs such as LAT, LIME1, SLA2, DEFB4A, GBP1 and GBP2 were significantly associated with the function of immune response. These mRNAs were co-expressed with lncRNAs related to NSCLC.
Discussion
NSCLC ranks among the most frequently diagnosed cancers as well as the most lethal malignancies [1]. Although the functional roles and dysregulation of lncRNAs in cancer development and progression are beginning to be disclosed [7], research on the mechanisms of action of lncRNAs in NSCLC remains at a preliminary stage. In the current study, we used comprehensive bioinformatics approaches to analyze lncRNA profiles from RNA-seq data in order to screen the differentially expressed lncRNAs and co-expressed mRNAs in female never-smokers with NSCLC. Strikingly, the up-regulated lncRNAs, including lnc-GGPS1, lnc-ZNF793 and lnc-STK4, and the down-regulated lncRNAs, such as lnc-LOC284440, lnc-PPIEL, and lnc-ZNF461, were considered to be strongly related to NSCLC. In addition, the mRNAs co-expressed with these lncRNAs, such as LAT, LIME1, SLA2 and DEFB4A, were all significantly associated with the function of immune response.
Numerous lines of evidence have suggested that the interactions between the host immune system and tumors are closely tied to the process of tumorigenesis, and that intratumoral immune responses can predict the prognosis of patients with NSCLC [24]. Increased infiltration of CD4+/CD8+ T cells and other antigen-presenting cells in NSCLC tissues is independently associated with improved survival [25-27]. Immune cells in the tumor microenvironment can interact closely with the transformed cells to actively promote oncogenesis [28]. Furthermore, Dougan et al. reported that alterations in immune response genes could initiate the development of lung cancer in the absence of germline mutations in known oncogenes or tumor suppressors [29]. Therefore, blockade of immune system function may contribute to NSCLC progression, and the initiation of an immune response may be an important indicator of NSCLC.
LAT is a transmembrane protein of 36-38 kDa, which plays an important role in T cell activation and immune receptor signaling [30, 31]. The binding of major histocompatibility complex-bound foreign peptides to T cell antigen receptors (TCRs) is a key event in the regulation of the adaptive immune response [32]. LAT may function as a bridge between TCR-initiated, T cell-specific signaling events and general signaling pathways [32]. In addition, LIME1 is a raft-associated transmembrane adaptor phosphoprotein that has been shown to be an organizer of immunoreceptor signaling [33]. It is expressed predominantly in T lymphocytes and has been found to mediate T cell activation [34]. The work of Brdičková et al. revealed that LIME was involved in CD4 and CD8 coreceptor signaling [35]. Given the important functions of LAT and LIME1 in the immune response and the contribution of the immune response to NSCLC progression, we speculate that LAT and LIME1 may play an important role in promoting NSCLC progression by triggering an immune response. In our study, lnc-GGPS1, lnc-ZNF793 and lnc-STK4 were co-expressed with LAT and LIME1. Although the roles of these lncRNAs in NSCLC development have not been fully discussed, we speculate that lnc-GGPS1, lnc-ZNF793 and lnc-STK4 may be involved in the immune response and promote NSCLC progression via co-expression with LAT and LIME1.
SLA2 is one of the Src-like adaptor proteins (SLAPs), and SLAP can down-regulate the T cell receptor on CD4 and CD8 thymocytes [36]. A previous study also revealed that SLAP is a negative regulator of T cell receptor signaling [37]. Furthermore, SLA2 shares amino acid and structural homology with SLAP, which has been confirmed to down-regulate TCR-mediated signal transduction [37, 38]. Additionally, DEFB4A is a member of the defensin family, which is involved in the first line of defense in the innate immune response against pathogens [39]. Enhanced expression of DEFB4A is evidence of a proinflammatory and/or innate immune response [40]. Therefore, SLA2 and DEFB4A may be key molecules involved in the immune response and may play important roles in cancer progression, although there are hardly any studies on the roles of SLA2 and DEFB4A in NSCLC. Notably, down-regulated lncRNAs, such as lnc-LOC284440, lnc-PPIEL, and lnc-ZNF461, were co-expressed with SLA2 and DEFB4A. Therefore, our results further suggest that the above lncRNAs may contribute to the development and progression of NSCLC.
In conclusion, our findings suggest that the immune response may be an important mechanism involved in NSCLC progression. Lnc-GGPS1, lnc-ZNF793, lnc-STK4, lnc-LOC284440, lnc-PPIEL, and lnc-ZNF461 may be involved in the immune response and promote NSCLC progression via co-expression with LAT, LIME1, SLA2 and DEFB4A. The present findings shed new light on the molecular mechanism of NSCLC and may lead to novel clinical applications in oncology. However, the lack of experimental validation is a limitation of our study. More work is still needed to explore the potential molecular mechanisms for the diagnosis and treatment of NSCLC.